Multimodal Sensor Fusion for Non-Destructive Tea Quality Evaluation: Deep Learning-Enabled Methods, Applications, and Challenges

Hu, Xinyu; Zhang, Meng; Yang, Biyue; Tao, Yuefei; Wei, Wei

doi:10.3390/foods15101810

Open Access Review

by Xinyu Hu, Meng Zhang, Biyue Yang, Yuefei Tao and Wei Wei *

School of Agricultural Engineering, Jiangsu University, Zhenjiang 212013, China

* Author to whom correspondence should be addressed.

Foods 2026, 15(10), 1810; https://doi.org/10.3390/foods15101810

Submission received: 14 April 2026 / Revised: 13 May 2026 / Accepted: 19 May 2026 / Published: 20 May 2026

(This article belongs to the Topic Multidisciplinary Advances in Tea Science: Smart Cultivation, Digital Processing, and Health Innovation)

Abstract

Tea quality evaluation is increasingly moving from subjective sensory assessment and destructive laboratory analysis toward rapid, non-destructive, and data-driven approaches. This review summarizes recent advances in multimodal sensing integrated with deep learning for tea quality evaluation, with emphasis on sensor complementarity, data-fusion strategies, representative applications, and deployment-related limitations. Major sensing modalities, including machine vision, near- and mid-infrared spectroscopy, Raman and fluorescence spectroscopy, hyperspectral imaging, and electronic nose/electronic tongue systems, are discussed in relation to their ability to characterize appearance, chemical composition, aroma, flavor, processing status, and safety-related attributes. Applications are examined for quality grading, chemical composition prediction, aroma and flavor characterization, fermentation monitoring, and safety-related extensions across representative tea products, including green tea, black tea, dark tea, matcha, and jasmine tea. Overall, multimodal approaches can outperform single-sensor systems only when the selected modalities provide complementary, rather than redundant, information layers. However, practical translation remains constrained by small and weakly standardized datasets, insufficient external validation, sensor instability, limited model transferability, high computational cost, and insufficient interpretability. Future research should prioritize standardized datasets, leakage-free validation protocols, interpretable multimodal modeling, truly independent external validation, interoperable multi-sensor platforms, and lightweight deployable models.

Keywords:

tea quality; non-destructive evaluation; multimodal sensing; deep learning; data fusion; sensor complementarity

1. Introduction

Tea is one of the most widely consumed beverages in the world, and its quality and safety directly influence consumer acceptance and public health. Traditionally, tea evaluation has relied on sensory inspection by trained assessors and on laboratory-based physicochemical analysis. Sensory assessment considers appearance, aroma, and taste, but its reproducibility is often limited by assessor experience and subjectivity [1,2]. Instrumental methods such as gas chromatography–mass spectrometry (GC–MS) provide accurate chemical information, yet they are often time-consuming, costly, and destructive. For example, aroma profiling in black tea typically depends on GC–MS-based identification and quantification, whereas the determination of bioactive compounds in green tea often requires labor-intensive extraction and wet-chemistry procedures [3,4]. These limitations make conventional approaches increasingly inadequate for a modern tea industry that requires rapid, objective, and non-destructive quality control.

With recent advances in sensor technology and artificial intelligence, multimodal sensing has emerged as a promising route for tea quality assessment [5,6,7]. In this context, multimodal sensing refers to the acquisition of complementary information from multiple sensor types, followed by data integration to characterize tea quality more comprehensively [8,9]. Three groups of modalities are common. Machine vision captures appearance-related traits such as shape, size, and color uniformity, for example by converting RGB images into HSV or Lab space to quantify browning during fermentation [10]. Spectroscopic methods, including near-infrared (NIR), mid-infrared, Raman, and fluorescence spectroscopy, probe chemical composition and molecular structure and enable rapid prediction of constituents such as polyphenols and amino acids [11,12,13]. Electronic nose/electronic tongue systems emulate human olfactory and gustatory perception through sensor arrays for aroma, flavor, origin, and grade discrimination [12,14,15]. When these complementary data streams are coupled with modern pattern-recognition algorithms, they enable more objective and robust tea evaluation.

Deep learning has become a central tool in intelligent food-quality analysis [16,17,18]. By learning hierarchical nonlinear representations from raw or minimally processed data, deep models can outperform conventional empirical indices and shallow machine-learning approaches. Convolutional neural networks (CNNs) are particularly effective for image-based tea grading [19], whereas one-dimensional CNNs and recurrent neural networks (RNNs) are well suited to spectral or temporal sensor signals, such as fermentation dynamics and flavor evolution. Attention mechanisms and residual architectures can further improve feature selection and model stability under limited-sample conditions [20]. Nevertheless, model performance still depends heavily on dataset quality and representativeness, underscoring the need for larger, publicly accessible datasets covering multiple tea types, origins, and processing conditions.

Several reviews have addressed tea quality evaluation from the perspectives of analytical techniques, sensory methods, or specific sensing tools. More recently, Wu et al. [21] systematically reviewed deep-learning applications in tea quality monitoring across cultivation, processing, and product evaluation. In contrast, the present review focuses specifically on multimodal, non-destructive tea quality evaluation, with particular emphasis on sensor complementarity, fusion strategies, representative tea products, and deployment-related challenges. We reorganize the literature according to a quality-attribute-driven framework that links target quality attributes, target substance groups, sensing modalities, fusion strategies, deep-learning architectures, and deployment constraints. We aim to clarify why multimodal sensing can outperform single-sensor systems and what conditions are required for its industrial translation.

2. Framework of Multimodal Sensing and Deep Learning for Tea Quality Evaluation

2.1. Tea Quality Attributes and Target Substance Groups

Tea quality is inherently a multidimensional concept and cannot be fully characterized by a single sensory attribute or a single physicochemical parameter. For finished tea products, appearance, color, liquor color, aroma, taste, mouthfeel, processing status, and key chemical composition jointly determine overall quality performance. The major target substance groups directly associated with these quality attributes include tea polyphenols and catechins, caffeine, free amino acids (particularly theanine), soluble sugars, pigments, volatile organic compounds (VOCs), and moisture-related indicators. For fermented or post-fermented teas, such as black tea and dark tea, dynamic changes in theaflavins, thearubigins, and their oxidative polymerization products, together with fermentation-derived volatiles and taste-active compounds, are also important bases for evaluating fermentation degree and final product quality [12,13,14].

From the perspective of multimodal sensing, tea quality attributes can be mapped onto several interrelated but non-equivalent information dimensions. Appearance, leaf shape, color, and uniformity are mainly reflected in visual and textural information; changes in polyphenols, caffeine, amino acids, moisture, and pigments are more directly associated with molecular responses involving absorption, scattering, fluorescence, or Raman signals; aroma quality mainly depends on VOC fingerprints; taste is closely related to soluble taste-active compounds in tea infusion and their electrochemical response patterns. Processing states, such as fixation, rolling, fermentation, drying, scenting, and aging, are manifested as the synchronous evolution of visual, spectral, gaseous, and environmental parameters [19,20]. Therefore, the premise of multimodal tea evaluation is not merely the simultaneous use of multiple sensors, but rather the prior clarification of the correspondence among target quality attributes, target substance groups, measurable information dimensions, and sensing modalities. In the strict sense, non-destructive detection should refer to analytical methods that complete measurement without compromising the original integrity of the sample. However, some spectroscopic, electrochemical, or enhanced sensing methods in tea research still require grinding, homogenization, infusion preparation, surface-enhanced substrates, or other pretreatment steps. Therefore, this review distinguishes strictly non-destructive methods from minimally destructive or sample-preparation-assisted rapid sensing approaches when necessary (Table 1).

Table 1. Mapping of Tea Quality Attributes, Target Indicators, Sensing Modalities, and Fusion Strategies.

Quality Attribute	Target Indicators	Dominant Modalities	Complementary Modalities	Suitable Fusion Strategies	Main Limitations
Appearance and color	Leaf shape/strip appearance, particle size, color uniformity, liquor color, surface defects	Machine vision, microscopic imaging	HSI/MSI, GLCM texture features	Early/intermediate fusion	Sensitive to illumination, background, and sample stacking
Intrinsic chemical composition	Tea polyphenols, catechins, caffeine, free amino acids, soluble sugars, moisture	NIR/MIR	Raman/SERS, HSI, fluorescence	Intermediate fusion	Difficult cross-instrument transfer and pronounced matrix effects
Aroma quality	VOC fingerprints, floral/fresh/roasted aroma, scenting intensity	Electronic nose, colorimetric sensor array, GC–IMS	GC–MS, Vis-NIR/HSI	Late/intermediate fusion	Sensor drift and insufficient specificity for low-abundance key aroma compounds
Taste and mouthfeel	Freshness/umami, bitterness, astringency, mellow mouthfeel, soluble taste-active compounds	Electronic tongue, electrochemical sensor array	NIR, FT-NIR, reference chemical analysis	Intermediate/late fusion	Electrode fouling and strong matrix effects in tea infusion
Processing status	Fixation/drying endpoint, fermentation degree, aging stage	HSI, NIR, electronic nose	IoT-based environmental sensing, machine vision	Temporal intermediate/late fusion	Difficulty in continuous annotation and high requirements for temporal synchronization
Safety-related extensions	Screening of pesticide residues, contaminants, adulteration, or abnormal samples	SERS, fluorescence, HSI	NIR, imaging, mass spectrometry confirmation	Late fusion	Diverse targets, insufficient databases, and some methods are not strictly non-destructive

HSI, hyperspectral imaging; MSI, multispectral imaging; GLCM, gray-level co-occurrence matrix; NIR, near-infrared spectroscopy; MIR, mid-infrared spectroscopy; SERS, surface-enhanced Raman spectroscopy; VOCs, volatile organic compounds; GC–IMS, gas chromatography–ion mobility spectrometry; GC–MS, gas chromatography–mass spectrometry; Vis-NIR, visible–near-infrared spectroscopy; FT-NIR, Fourier-transform near-infrared spectroscopy; IoT, Internet of Things. Fusion strategies listed in the table indicate recommended or commonly applicable approaches and are not mutually exclusive.

2.2. Complementarity Among Sensing Modalities

Multimodal sensing captures different information dimensions of the same sample through multiple sensors, thereby providing a more comprehensive representation than a single modality [20]. Compared with single-modal approaches, the advantage of multimodal methods lies not merely in “collecting more data”, but in the fact that different sensors correspond to different layers of quality-related information. As a result, they can form complementary relationships in complex samples, thereby improving quality characterization, model robustness, and cross-scenario adaptability [8,9,20].

Machine-vision-based modalities mainly characterize external attributes, including leaf shape, strip morphology, particle size, color uniformity, liquor color, and surface defects [19,22]. Their strength lies in rapid, non-contact acquisition and strong interpretability for grading tasks. When image features are extended from simple RGB color descriptors to HSV or Lab representations, or further to statistical texture descriptors, visual sensing becomes more sensitive to structural heterogeneity and subtle process-induced changes. In this context, GLCM-based texture descriptors can serve as interpretable complements to deep visual features, especially for strip tea, granular tea, and matcha powder, where surface morphology contributes materially to quality evaluation [23,24].
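As a rough illustration of the color-space step described above, the sketch below converts RGB pixels to HSV with Python's standard `colorsys` module and summarizes them by a circular mean hue and a mean saturation. The function name `mean_hue_saturation` and the pixel values are hypothetical, and a real pipeline would normally apply illumination correction and background segmentation first.

```python
import colorsys
import math


def mean_hue_saturation(pixels):
    """Return (mean hue in degrees, mean saturation) for 8-bit RGB pixels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    sin_sum = cos_sum = sat_sum = 0.0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        angle = 2.0 * math.pi * h
        # Hue is an angle, so it is averaged on the unit circle.
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
        sat_sum += s
    hue_deg = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    return hue_deg, sat_sum / len(pixels)


# Hypothetical pixel samples: a fresh green leaf and a browned leaf.
green_leaf = [(60, 140, 50), (70, 150, 60)]
browned_leaf = [(139, 69, 19), (150, 80, 30)]
green_hue, _ = mean_hue_saturation(green_leaf)
brown_hue, _ = mean_hue_saturation(browned_leaf)
```

In this toy case the browned pixels have a lower hue angle (toward orange) than the green ones, which is the kind of shift a hue-based browning index would track.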

Spectroscopic methods mainly characterize internal chemical composition and molecular structure. NIR and MIR are most commonly used in rapid quantitative tea analysis to predict moisture, tea polyphenols, caffeine, free amino acids, and other major constituents [11,12,13,25,26,27,28]. Raman spectroscopy and SERS provide more molecule-specific vibrational fingerprints and are suitable for the rapid identification of characteristic molecules or trace targets [27]. Fluorescence spectroscopy and fluorescence hyperspectral imaging can capture differences in excitation–emission responses caused by certain components or surface residues [26]. HSI integrates spatial and spectral information, enabling simultaneous characterization of “component content” and “spatial distribution”; therefore, it is particularly advantageous for visualizing chlorophyll, moisture, pigment distribution, powder uniformity, and local defects [20,25,29].

Olfactory and gustatory sensor systems represent another critical layer of complementarity. In this review, olfactory sensors mainly refer to MOS-based electronic noses, colorimetric sensor arrays, and GC–IMS-assisted volatile-fingerprint systems that capture VOC-related aroma profiles [2,14,30]. Taste sensors mainly refer to electronic tongues and electrochemical sensor arrays that reflect the electrochemical response patterns of amino acids, catechins, caffeine, soluble taste-active compounds, and related matrix effects in tea infusion [12,14,15]. These modalities are especially useful when chemical composition alone cannot adequately explain perceived aroma intensity, freshness, bitterness, astringency, or processing degree.

Therefore, the true value of multimodal systems lies in complementarity rather than redundant accumulation. Visual modalities are effective for characterizing appearance and color, but they cannot directly quantify theanine or caffeine; NIR is suitable for rapid prediction of overall chemical composition, but it is insufficiently sensitive to subtle aroma differences; electronic noses can capture volatile profiles, but they cannot directly explain taste; and electronic tongues can reflect taste-related electrochemical patterns, but they do not provide information on leaf strip morphology or aroma purity. Precisely because these blind spots are offset across modalities, the integration of image, spectral, olfactory, and gustatory information can better approximate the comprehensive decision-making logic used in sensory evaluation and quality control (Figure 1). For example, image–spectral information fusion has been used to improve the prediction of chlorophyll distribution during tencha/matcha drying [20,29], whereas the fusion of olfactory and gustatory sensing signals can enhance the accuracy of oolong tea variety discrimination [31]. These findings indicate that the decisive condition under which multimodal approaches outperform single-sensor systems is whether the complementary modalities correspond to different but related layers of quality information.

2.3. Data Fusion Strategies for Heterogeneous Tea-Sensing Data

Early fusion refers to the direct concatenation of raw signals or manually extracted features before modeling. For example, color parameters, texture vectors, spectral variables, electronic-nose responses, and electronic-tongue signals can be combined into a unified feature matrix and then input into PLS, SVM, random forest, or neural network models [24,31]. Its advantages are straightforward implementation and suitability for small-sample studies. However, its limitations are also evident: different modalities often differ substantially in dimensionality, scale, noise level, and sampling frequency. High-dimensional spectral variables may overwhelm low-dimensional visual or sensor-array response features, causing the model to become biased toward a particular modality. Therefore, normalization, variable selection, modality weighting, and outlier control are key preprocessing steps for early fusion.
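The normalization and modality-weighting steps can be sketched as follows. This is a minimal example, assuming column-wise z-scoring followed by block scaling (dividing each modality by the square root of its variable count) so that a wide spectral block does not dominate a narrow color block. The function names and the toy data are hypothetical, and block scaling is only one of several possible weighting schemes.

```python
import math


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0
             for j in range(d)] for r in rows]


def early_fusion(blocks):
    """Concatenate modality blocks after z-scoring and block scaling.

    blocks maps a modality name to a list of per-sample feature rows;
    every block must list the same samples in the same order.
    """
    fused = None
    for _name, rows in blocks.items():
        z = zscore_columns(rows)
        weight = 1.0 / math.sqrt(len(z[0]))  # equalize total block variance
        scaled = [[weight * v for v in row] for row in z]
        fused = scaled if fused is None else [a + b for a, b in zip(fused, scaled)]
    return fused


# Hypothetical data: 3 samples, 2 color features and 4 spectral variables.
blocks = {
    "color": [[0.2, 0.5], [0.4, 0.1], [0.9, 0.3]],
    "spectrum": [[1, 2, 3, 4], [2, 1, 4, 3], [3, 5, 5, 1]],
}
fused = early_fusion(blocks)
```

After scaling, each modality contributes the same total variance to the fused matrix regardless of how many variables it has.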

Intermediate fusion, also known as feature-level deep fusion or representation-level fusion, is currently one of the most suitable strategies for multimodal tea evaluation. Its basic principle is to first extract features using modality-specific encoders and then couple these features within a shared latent space. For example, CNNs can be used to extract spatial features from images or HSI data, 1D-CNNs to extract local spectral features, and RNNs or temporal modules to capture dynamic features during fermentation or drying. These features can then be integrated through concatenation layers, attention modules, bilinear pooling, graph-structured associations, or transformer-based cross-modal interactions [32,33,34]. This approach preserves the structural characteristics of image, spectral, and sensor-array signals while making it easier to identify, through attention weights, which modality, wavelength, region, or time point is most critical for the final prediction.
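A highly simplified sketch of the attention-based coupling step is given below. It assumes that modality-specific encoders have already produced embeddings of equal length, and it uses a fixed `query` vector as a stand-in for parameters that would be learned in a real network. All names and values are hypothetical; the point is only to show how attention weights expose the relative contribution of each modality.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse equal-length modality embeddings with scaled dot-product attention.

    Returns the fused vector and a dict of per-modality attention weights.
    """
    names = list(embeddings)
    dim = len(query)
    scores = [sum(q * x for q, x in zip(query, embeddings[n])) / math.sqrt(dim)
              for n in names]
    weights = softmax(scores)
    fused = [sum(w * embeddings[n][j] for w, n in zip(weights, names))
             for j in range(dim)]
    return fused, dict(zip(names, weights))


# Hypothetical encoder outputs for one tea sample.
embeddings = {
    "image": [0.2, 0.1, 0.0],
    "spectrum": [0.9, 0.8, 0.7],
    "enose": [0.1, 0.3, 0.2],
}
query = [1.0, 1.0, 1.0]  # stand-in for a learned query vector
fused_vec, modality_weights = attention_fuse(embeddings, query)
```

Inspecting `modality_weights` is the toy analogue of reading attention weights to see which modality drives a prediction.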

Late fusion integrates the output-layer results after each modality has been modeled separately. Typical forms include weighted averaging, voting, stacking, Bayesian ensemble methods, or rule-driven multi-level decision-making. This strategy is particularly suitable for industrial deployment, because vision systems, NIR instruments, electronic noses, and electronic tongues in factory settings are often not sampled synchronously, and their maintenance cycles and failure probabilities also differ. Under such conditions, modular late fusion is more robust than full joint training and makes it easier to maintain system operation when one modality is missing, drifting, or temporarily unavailable. At the same time, late fusion offers advantages in interpretability, because the output contribution of each submodel can be examined independently.
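The robustness of late fusion to a missing modality can be illustrated with a short sketch. It assumes that each submodel outputs class probabilities and that each modality has a non-negative reliability weight; the function name, grade labels, and weights are hypothetical. Missing modalities are skipped and the remaining weights are renormalized.

```python
def late_fusion(predictions, reliability):
    """Reliability-weighted average of per-modality class probabilities.

    predictions maps modality -> {class: probability}, or None when the
    modality is unavailable. reliability maps modality -> weight >= 0.
    """
    available = {m: p for m, p in predictions.items()
                 if p is not None and reliability.get(m, 0.0) > 0}
    if not available:
        raise ValueError("no usable modality")
    total_weight = sum(reliability[m] for m in available)
    fused = {}
    for modality, probs in available.items():
        w = reliability[modality] / total_weight
        for cls, p in probs.items():
            fused[cls] = fused.get(cls, 0.0) + w * p
    return fused


# Hypothetical outputs: the electronic nose is offline for this sample.
predictions = {
    "vision": {"grade_A": 0.7, "grade_B": 0.3},
    "nir": {"grade_A": 0.4, "grade_B": 0.6},
    "enose": None,
}
reliability = {"vision": 1.0, "nir": 3.0, "enose": 2.0}
fused_probs = late_fusion(predictions, reliability)
decision = max(fused_probs, key=fused_probs.get)
```

Because each submodel's output stays visible, the contribution of each modality to `decision` can be checked directly.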

For multimodal tea-sensing data, future optimization of fusion strategies should not focus only on how to improve accuracy in a single experiment, but should also emphasize the engineering usability of heterogeneous data. This includes cross-modal normalization, reliability-based modality weighting, robust training under missing-modality conditions, cross-device calibration and model transfer, external validation-set design, and the incorporation of calibration transfer and instrument standardization into multimodal pipelines, thereby reducing shifts caused by differences among instruments, batches, and production origins [24,31].

2.4. Deep Learning Architectures for Multimodal Tea Evaluation

Deep learning characterizes complex nonlinear mappings between input features and quality indicators by constructing multilayer neural networks, and has become an important technical foundation for intelligent tea evaluation [16,17,18]. In tea-related applications, representative models include DNNs, CNNs, RNNs, attention mechanisms, graph neural networks, and other cross-modal deep architectures [35].

DNNs are more suitable for nonlinear classification and regression modeling based on structured feature matrices, such as the mapping between preprocessed and variable-selected spectral features and physicochemical indicators [36]. Compared with traditional linear regression or shallow machine-learning methods, DNNs can learn higher-order relationships among multiple constituents, nonlinear absorption responses, and quality labels. However, their performance is highly dependent on sample size, annotation quality, and regularization strategies. Because sample sizes remain limited in most tea-related studies, DNNs are better used as moderately complex feature learners than as networks that are deepened without constraint.

CNNs are particularly effective for processing images, hyperspectral data, and two-dimensional spectral maps, and have been widely applied to tea appearance grading, liquor color evaluation, defect recognition, and spatial-distribution visualization [25]. For one-dimensional spectral data, 1D-CNNs can also automatically learn local wavelength patterns, thereby reducing reliance on manually selected characteristic wavelengths. For HSI data, the core value of CNNs lies in their ability to simultaneously extract spatial and spectral information, enabling models not only to classify samples into quality grades but also to visualize the spatial distribution of chlorophyll, moisture, or other indicators at the pixel level [29].

RNNs and their variants are more suitable for processing time-series signals generated during dynamic tea processing. Tea fermentation, drying, scenting, and aging are not static state-recognition problems; rather, they are time-series processes involving continuous changes in color, aroma, moisture, and chemical composition. Therefore, RNNs, LSTMs, GRUs, or networks with temporal attention are suitable for predicting fermentation endpoints or processing stages [37,38]. If temperature and humidity, image data, VOC signals, and spectral signals are integrated into a unified temporal framework, it may also be possible to construct an approximate causal representation linking processing state to quality outcome, which is particularly important for online process control.

Attention mechanisms, graph neural networks, and lightweight modeling strategies should be regarded as necessary components of multimodal tea-sensing deployment rather than optional add-on modules. Attention mechanisms can be used to identify key wavelengths, critical imaging regions, important sensor channels, and decisive time points, thereby improving interpretability and suppressing irrelevant noise [20]. Graph neural networks are more suitable for representing structural relationships among modalities, indicators, and processing stages. For industrial applications, models must also balance lightweight deployment with interpretability. Therefore, compact convolutional networks, knowledge distillation, pruning, quantization, sparse attention, hybrid chemometric–deep learning frameworks, and interpretation strategies based on attention heatmaps or feature-contribution ranking should be incorporated as constraints during model design, rather than treated as remedial measures after deployment [19,20].

2.5. Emerging Sensor Models and Image-Feature Techniques

In addition to commonly used modalities such as machine vision, NIR/MIR, HSI, Raman spectroscopy, fluorescence, and electronic nose/electronic tongue systems, several emerging sensing platforms may further expand the representational capacity of future multimodal tea-sensing systems. First, prism-coupled surface plasmon resonance (prism-coupled SPR/PRISM) and photonic crystal fiber surface plasmon resonance sensing (PCF-SPR) show potential for refractive-index change detection and highly sensitive interfacial response measurement. Conventional prism-coupled SPR can be traced back to the classical optical excitation of surface plasmons based on frustrated total reflection [39], whereas PCF-SPR introduces surface plasmon resonance responses into microstructured optical fiber platforms, thereby further expanding the design space of SPR sensors in terms of compactness, coupling flexibility, and highly sensitive interfacial detection [40]. Although these platforms have not yet become mainstream tools for tea quality evaluation, they merit prospective consideration for the coupled detection of specific markers, adulterants, contaminants, or biochemical recognition elements.

IoT-based sensor models provide an engineering pathway for tea evaluation to move from “single offline measurement” toward “online continuous monitoring”. In tea processing, temperature, humidity, airflow, images, color, VOCs, and near-infrared signals essentially constitute a multisource spatiotemporal system rather than a set of independent one-time measurements. An IoT-based tea fermentation monitoring system integrating Raspberry Pi, cameras, and wireless transmission has already been applied in tea-factory scenarios for the recognition of fermentation stages in black-tea-type processing, and real-time discrimination performance was improved through CNNs and majority voting [41]. Such studies indicate that the value of IoT platforms lies not in replacing a single laboratory instrument, but in integrating dispersed visual, environmental, and quality-related signals into an online monitoring network that can be accessed in real time.
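One plausible way to apply majority voting in such a system is to smooth per-frame classifier outputs over a short sliding window, so that an isolated misclassified frame does not change the reported fermentation stage. The sketch below assumes this windowed scheme; the cited study's exact voting procedure is not described here, and the stage labels and window size are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the most recent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    top = max(counts.values())
    for label in reversed(labels):
        if counts[label] == top:
            return label


def smooth_predictions(labels, window=3):
    """Replace each prediction by the majority vote over the last `window` frames."""
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        smoothed.append(majority_vote(recent))
    return smoothed


# Hypothetical per-frame CNN outputs with one spurious early "optimal".
frames = ["under", "under", "optimal", "under",
          "optimal", "optimal", "optimal"]
stable = smooth_predictions(frames, window=3)
```

With a window of three frames, the single early "optimal" is voted down and the stage only switches once "optimal" is observed repeatedly.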

In addition, GLCM and its combination with LBP and color-space features should be regarded as interpretable visual features that remain valuable in the deep-learning era. Ramola et al. systematically summarized the applicable boundaries of statistical texture-analysis methods, including GLCM, LBP, and ACF, and indicated that GLCM remains one of the most robust and interpretable methods for surface texture analysis [23]. More importantly, Tang et al. demonstrated in tea classification that the LBP–GLCM combination can effectively extract texture features from green tea leaves at relatively low computational cost, making it suitable for automated production-line environments [24]. Therefore, for samples with pronounced differences in surface structure, such as strip-shaped tea, granular tea, and matcha powder, GLCM should not be regarded as an outdated feature replaced by CNNs, but rather as a lightweight visual-modality supplement that can be fused with deep features to improve interpretability.
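To make the GLCM idea concrete, the following sketch builds a symmetric, normalized co-occurrence matrix for a single pixel offset and derives two common texture descriptors, contrast and homogeneity. It is a minimal pure-Python illustration on a hypothetical 4 × 4 image with four gray levels; production code would typically use an image-processing library and several offsets and angles.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                a, b = image[y][x], image[ny][nx]
                counts[a][b] += 1
                counts[b][a] += 1  # count both directions for symmetry
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset produces no pixel pairs")
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2; larger for coarser, high-variation texture."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    """Sum of p(i, j) / (1 + (i - j)^2); larger for smooth texture."""
    return sum(p[i][j] / (1 + (i - j) ** 2)
               for i in range(len(p)) for j in range(len(p)))


# Hypothetical 4-level image patch; horizontal neighbor offset (dx=1, dy=0).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
patch_contrast = contrast(p)
patch_homogeneity = homogeneity(p)
```

Descriptors such as these form a short, interpretable feature vector that can be concatenated with deep features in a fusion model.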

2.6. Dataset Standardization, Model Generalization, and Industrial Deployment

Although multimodal sensing and deep learning have shown considerable potential for tea quality evaluation, their industrial translation is still constrained by multiple bottlenecks related to data, instruments, models, and application scenarios. The primary challenge is the lack of standardized, large-scale, and reusable multimodal tea datasets. Tea samples differ in cultivar, geographic origin, season, plucking standard, processing technology, storage history, and grade definition. In addition, the sensory scoring systems, physicochemical measurement protocols, instrument types, acquisition illumination, and sample presentation methods used across different studies are not standardized [19,20]. As a result, models reported as having “high accuracy” in different studies are often obtained under local experimental conditions rather than under truly transferable data distributions.

This inconsistency directly affects model evaluation and generalization. Small and weakly standardized datasets increase the risk that models learn acquisition conditions, batch differences, or background noise rather than robust quality features. Random split validation within the same batch, origin, or instrument can also overestimate real-world performance. Moreover, chemical indicators are usually continuous variables, whereas sensory grades are often subjective and discrete; without unified annotation standards, multimodal fusion may learn label noise rather than true modality complementarity.
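The difference between a random split and a leakage-aware split can be sketched with a group-wise cross-validation routine, in which all samples from the same batch, origin, or instrument fall on the same side of each split. The function below is a minimal illustration with hypothetical batch labels; scikit-learn's `GroupKFold` offers a maintained implementation of the same idea.

```python
def group_kfold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    unique = sorted(set(groups))
    if n_splits < 2 or n_splits > len(unique):
        raise ValueError("n_splits must be between 2 and the number of groups")
    # Assign whole groups to folds in round-robin order.
    fold_of = {g: i % n_splits for i, g in enumerate(unique)}
    for fold in range(n_splits):
        test_idx = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train_idx = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train_idx, test_idx


# Hypothetical batch label for each of seven samples.
batches = ["batch1", "batch1", "batch2", "batch2",
           "batch3", "batch3", "batch4"]
splits = list(group_kfold(batches, n_splits=2))
```

Each test fold then contains only batches the model has never seen during training, which gives a more honest estimate of transfer to new batches than a random within-batch split.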

Vision systems are strongly affected by illumination, background, camera angle, and sample stacking. NIR/HSI systems are influenced by light-source aging, detector drift, temperature variation, and sample presentation. Electronic noses are susceptible to humidity, memory effects, and MOS sensor drift, whereas electronic tongues face electrode fouling, matrix effects, and declining long-term repeatability [27,40]. Therefore, sensor drift correction, instrument standardization, and calibration transfer must be regarded as fundamental components of multimodal tea-sensing systems rather than post-processing options. For cross-instrument NIR/spectral models, classical multivariate instrument standardization, including subsequent PDS-related approaches, provides a methodological basis for establishing transferable calibration relationships between different instruments [41]. In the context of tea evaluation, this means that if the same quality model is to be transferred from a laboratory platform to portable devices or factory online instruments, standard samples, transfer sets, and periodic recalibration procedures must be explicitly designed.
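As a simple illustration of how a transfer set is used, the sketch below applies slope/bias correction, a much simpler technique than the PDS-type standardization mentioned above. It assumes that the primary-instrument model has been applied to spectra of the transfer samples measured on the secondary instrument, and that reference values for those samples are known. The function names and numbers are hypothetical.

```python
def fit_slope_bias(secondary_pred, reference):
    """Least-squares slope and bias mapping secondary predictions to reference values."""
    n = len(secondary_pred)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two paired transfer samples")
    mean_x = sum(secondary_pred) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_pred)
    if sxx == 0:
        raise ValueError("secondary predictions are constant")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(secondary_pred, reference))
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias


def correct(prediction, slope, bias):
    """Apply the fitted correction to a new secondary-instrument prediction."""
    return slope * prediction + bias


# Hypothetical transfer set: moisture (%) reference values and the raw
# predictions obtained on a portable device with the laboratory model.
reference = [5.0, 6.0, 7.0, 8.0]
portable_pred = [5.5, 6.0, 6.5, 7.0]
slope, bias = fit_slope_bias(portable_pred, reference)
corrected = correct(6.75, slope, bias)
```

Slope/bias correction only adjusts the predictions, not the spectra, so it cannot compensate for wavelength shifts or changes in band shape; that is where spectral standardization methods are needed.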

For most tea factories or industrial laboratories, practical quality control usually relies on one or two routine methods, such as sensory evaluation, moisture or color measurement, or selected physicochemical assays, whereas GC–MS, LC–MS, and standard wet-chemistry methods mainly serve confirmatory and calibration functions [2,3,14]. A realistic industrial pathway is therefore not to replace all existing laboratory workflows, but to build a hierarchical system of rapid screening, process monitoring, and laboratory confirmation. Machine vision, portable NIR, electronic noses, and IoT sensors can support high-throughput or online monitoring, while more accurate laboratory methods can provide calibration, risk rechecking, and regulatory confirmation.

Deployable systems must balance accuracy, computational cost, and interpretability. High-accuracy deep models often require greater computing power and more complex parameter tuning than factory or portable-device environments can support. Lightweight models, edge deployment, model compression, and hybrid chemometric-deep learning frameworks should therefore be considered early. Industrial users also need to understand which wavelengths, image regions, gas-sensor channels, or electrochemical responses drive a prediction. Only interpretable, calibratable, and transferable systems are likely to move from proof-of-concept studies to routine quality control. In summary, a multimodal intelligent framework for tea quality evaluation should follow a unified logic: define quality attributes and target substance groups, select complementary sensing modalities, generate stable predictions through appropriate fusion strategies and deployable models, and support industrial translation through standardized datasets, sensor calibration, and hierarchical confirmation systems (Figure 2).

3. Applications in Representative Tea Products

3.1. Green Tea

Green tea is an unfermented tea valued for the balance between external appearance and intrinsic sensory quality, including emerald-green dry leaves, bright yellow-green liquor, fresh aroma, and brisk, umami-rich taste. Traditionally, these attributes have been assessed by experienced tea makers, which introduces unavoidable subjectivity. Multimodal sensing now provides a more objective basis for green-tea quality evaluation.

3.1.1. Appearance Quality Monitoring

Integrated machine vision and spectral sensing have been explored for real-time monitoring of green-tea processing. Lan et al. combined a miniature NIR sensor with a visible-light camera to predict moisture content during the kill-green stage in real time [42]. Among the partial least squares (PLS), support vector machine (SVM), and neural-network models compared, a whale-optimized Elman network using mid-level fusion features performed best, yielding a prediction-set correlation coefficient (Rp) of 0.9984. This study demonstrates that sensor fusion can support precise thermal control and improve consistency in final leaf appearance. In another example, Li et al. investigated ultrasound-assisted partial fermentation and used a colorimetric sensor array coupled with a CNN to track dynamic changes in volatile compounds and polyphenols [43]. Under optimized conditions, polyphenol degradation exceeded 66%, and the CNN outperformed conventional multivariate calibration. Chen et al. further showed that CNN models integrating fluorescence spectral features could non-destructively distinguish green-tea samples treated with different pesticides, with a test accuracy above 99% [28]. Collectively, these studies highlight the potential of multimodal sensing for both process monitoring and safety evaluation in green-tea production.
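To make the idea of mid-level (feature-level) fusion concrete, the following minimal sketch compresses each modality separately and then concatenates the resulting feature vectors. It assumes PCA as the per-modality feature extractor and uses random placeholder arrays; the array names, shapes, and component counts are illustrative assumptions, and the sketch does not reproduce the whale-optimized Elman network of the cited study.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on training data only; returns (mean, component matrix)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_scores(X, pca):
    """Project samples onto the components fitted on the training set."""
    mean, components = pca
    return (X - mean) @ components.T

# Hypothetical placeholder data: 200-point NIR spectra and 12 image features.
rng = np.random.default_rng(0)
nir_train, nir_new = rng.normal(size=(60, 200)), rng.normal(size=(10, 200))
img_train, img_new = rng.normal(size=(60, 12)), rng.normal(size=(10, 12))

# Each modality gets its own feature extractor, fitted on training samples.
nir_pca = fit_pca(nir_train, n_components=5)
img_pca = fit_pca(img_train, n_components=3)

# Mid-level fusion: concatenate the per-modality feature vectors sample by sample.
fused_train = np.hstack([pca_scores(nir_train, nir_pca), pca_scores(img_train, img_pca)])
fused_new = np.hstack([pca_scores(nir_new, nir_pca), pca_scores(img_new, img_pca)])
```

The fused matrix would then be passed to a regression or classification model. In practice the concatenated features are usually autoscaled first, because scores from different sensors can differ in magnitude by orders of magnitude.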

3.1.2. Intrinsic Composition and Sensory Quality Prediction

Green-tea quality is also determined by infusion taste and chemical composition. Near-infrared (NIR) spectroscopy has therefore been widely used for rapid, non-destructive quantification of key constituents [44]. Wu et al., for example, developed a portable NIR system combined with partial least squares regression (PLSR) to predict caffeine and amino acid contents in green-tea infusion within minutes, with R² values above 0.90 [35]. Electronic-tongue systems have likewise been used to evaluate taste attributes such as umami and astringency. When cyclic-voltammetry signals were reduced by principal component analysis (PCA) and input into a back-propagation neural network (BPNN), green teas of different quality grades could be classified accurately on the basis of taste [45]. Hyperspectral imaging has also been applied to green-tea grading. Early work showed that selecting three optimal wavelengths from visible-to-shortwave-infrared data, together with texture analysis, allowed an SVM model to classify green tea into five grades with accuracies of 98% and 95% in the training and test sets, respectively [46]. Despite these promising results, most multimodal studies on green tea remain at the laboratory stage, and models based on spectral or electronic-nose data are rarely validated across large sample sets or multiple geographic origins [47,48,49]. Future work should therefore focus on larger datasets, standardized evaluation frameworks, and more rigorous external validation.

3.2. Black Tea

Black tea undergoes full fermentation, and its quality depends strongly on the fermentation degree and volatile-aroma composition [50]. Conventional black-tea evaluation still relies heavily on subjective judgment of appearance, aroma, and taste. Multimodal sensing offers a route toward more objective and reproducible quality standards.

3.2.1. Intelligent Aroma Quality Discrimination

Electronic-nose systems have been widely used to classify black-tea aroma types and grades because they can rapidly capture overall volatile fingerprints. Wang et al., for instance, combined a metal-oxide-semiconductor (MOS) e-nose with automated thermal desorption GC–MS to analyze volatile profiles in Keemun black tea of different grades, and achieved successful discrimination using partial least squares-discriminant analysis (PLS-DA) [2]. The e-nose was sensitive to changes in overall aroma intensity associated with fermentation and could distinguish low-, medium-, and high-grade samples. Its limitation, however, lies in the identification of trace compounds that define subtle aroma nuances. For this reason, e-nose screening is often complemented by chromatographic confirmation. Recent work has increasingly adopted a combined strategy of rapid e-nose screening plus chromatographic/olfactometric identification, enabling more comprehensive evaluation of premium black-tea aroma [2,3].

3.2.2. Rapid Determination of Chemical Composition

Key chemical indices of black-tea quality, such as theaflavins and thearubigins, have traditionally been determined by laborious wet-chemistry methods [51,52,53,54]. Portable spectroscopic tools, especially NIR, now offer a faster and non-destructive alternative. Handheld NIR combined with spectral preprocessing and PLSR has been used to predict theaflavin content in tea infusion, achieving calibration R² values of up to 0.94 and prediction errors below 5% [35]. Visible-light machine vision has also been used to evaluate liquor color and infused-leaf appearance by extracting CIE L*a*b* color parameters from digital images and correlating them with sensory scores [55]. In addition, FT-NIR combined with chemometric algorithms has been used to assess both taste attributes and internal composition. A BP-AdaBoost model, for example, predicted eight taste-related compounds in black tea with prediction-set correlation coefficients above 0.76, outperforming the conventional Si-PLS model [56]. Surface-enhanced Raman spectroscopy (SERS) is another emerging tool for rapid assessment of intrinsic quality markers; AuNP-based SERS substrates coupled with a Si-GA-PLS model enabled rapid prediction of caffeine content in black tea [57]. Electronic-tongue systems have also shown value in taste evaluation. A cyclic-voltammetry electronic tongue integrated with a Si-VCPA-PLS model predicted total free amino acids with Rp = 0.84 after multi-electrode data fusion [58]. Together, these approaches illustrate the growing feasibility of continuous, non-destructive monitoring of black-tea chemistry.
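As an illustration of how color parameters can be derived from digital images, the sketch below converts an 8-bit RGB pixel to CIE L*a*b* coordinates. It assumes that the camera output is standard sRGB with a D65 white point, which is an assumption rather than a detail taken from the cited work; real imaging systems normally need color calibration against a reference chart first. The example pixel value is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 reference white)."""
    def linearize(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # XYZ -> L*a*b*, normalized by the D65 white point
    xn, yn, zn = 0.95047, 1.0, 1.08883
    delta = 6 / 29

    def f(t):
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

# Hypothetical reddish-orange liquor pixel
L_star, a_star, b_star = srgb_to_lab(180, 60, 30)
```

Averaging the converted values over a segmented liquor or leaf region gives the mean L*, a*, and b* values that can then be correlated with sensory scores.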

3.2.3. Fermentation Process Monitoring

Fermentation is a decisive step in black-tea manufacture and is traditionally controlled on the basis of experience with time, temperature, humidity, and aroma. Multimodal sensing provides a more objective basis for monitoring and regulating this process. Electronic noses can track dynamic aroma changes released during fermentation, and signal stabilization may indicate that fermentation is approaching completion [37,59]. Raman spectroscopy has likewise been used to monitor chemical transformations such as polyphenol oxidation and pigment formation by tracking changes in characteristic bands. Although intelligent monitoring of black-tea fermentation is still at an early stage, related studies in oolong and Pu-erh tea demonstrate the feasibility of this approach through aroma-pattern recognition and polyphenol prediction using sensor arrays and deep learning [38]. A recent study on black-tea pile fermentation further developed a portable artificial olfactory system based on a printed colorimetric sensor array and a KNN-AdaBoost model, achieving 100% classification accuracy in both training and prediction sets for fermentation-stage discrimination [60]. These results underscore the potential of real-time, non-invasive sensing for intelligent control of black-tea fermentation.
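The notion that a stabilizing e-nose signal marks the approach of the fermentation endpoint can be expressed as a simple plateau test. The sketch below flags the first reading at which the most recent window of values varies by less than a chosen relative tolerance. The function name, window length, tolerance, and the response series are all hypothetical choices for illustration and are not taken from the cited studies.

```python
def first_stable_index(signal, window=5, tol=0.02):
    """Return the index of the first reading whose trailing window is stable.

    A window counts as stable when (max - min) / |mean| is below `tol`.
    Returns None if no window in the series meets the criterion.
    """
    for end in range(window, len(signal) + 1):
        recent = signal[end - window:end]
        mean = sum(recent) / window
        if mean != 0 and (max(recent) - min(recent)) / abs(mean) < tol:
            return end - 1
    return None

# Hypothetical normalized sensor response sampled at regular intervals
response = [0.10, 0.25, 0.45, 0.62, 0.75, 0.84, 0.90,
            0.93, 0.95, 0.96, 0.965, 0.968, 0.97, 0.971]
stable_at = first_stable_index(response)
```

A deployed system would need to combine such a rule with drift compensation and with reference chemistry, since a flat sensor signal can also result from sensor saturation rather than from process completion.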

3.3. Dark Tea

Dark tea, a representative post-fermented tea category, includes products such as Pu-erh from Yunnan and Anhua dark tea from Hunan. Its distinctive sensory profile is shaped by pile fermentation (wo dui) and long-term aging, resulting in reddish-brown liquor, aged aroma, and a thick, mellow mouthfeel [61,62]. Traditional evaluation depends largely on expert olfactory and gustatory judgment of fermentation degree and aging status. Multimodal sensing provides a promising route toward more objective and quantitative assessment of these complex quality traits.

3.3.1. Intelligent Identification of Aroma and Taste Qualities

During fermentation and aging, dark tea develops characteristic aged aromas and a mellow taste profile. Electronic-nose systems can capture VOC fingerprints that change across fermentation stages and storage years, enabling pattern recognition and sample classification. Electronic-tongue systems similarly mimic taste perception through liquid sensor arrays and can quantify infusion taste characteristics. For example, voltammetric e-tongue data combined with deep learning have been used to discriminate Pu-erh teas with different storage durations [38]. Ouyang et al. further integrated voltammetric e-tongue signals with a BP neural network to predict total free amino acids in dark-tea samples, achieving Rp = 0.84 [59]. These olfactory and gustatory sensing strategies provide a data-driven basis for standardizing aroma and taste evaluation in dark tea.

3.3.2. Internal Components and Quality Indicator Detection

Dark-tea quality is also closely related to chemical changes during fermentation, including polyphenol oxidation and variations in caffeine and organic-acid content. Non-destructive methods such as NIR and Raman spectroscopy therefore offer valuable tools for rapid assessment of internal quality attributes. Related FT-NIR studies coupled with chemometric modeling have shown that multiple taste-related compounds can be predicted simultaneously with correlation coefficients above 0.85 in calibration and validation sets [63]. Optical colorimetric sensing offers another useful route. Liu et al. developed a colorimetric sensor array comprising eight porphyrins and one pH indicator to monitor chemical changes during Pu-erh pile fermentation; a CNN trained directly on the resulting color images achieved a prediction-set correlation coefficient (Rp) of approximately 0.87 for total polyphenol content [38]. These spectral and colorimetric strategies provide efficient and objective tools for monitoring compositional change and supporting standardized quality evaluation in dark tea.

3.3.3. Fermentation and Aging Process Monitoring

Pile fermentation is a key determinant of dark-tea quality. Traditionally, producers judge fermentation progress and endpoint from empirical cues such as heap temperature and aroma, which can introduce inconsistency. Multimodal sensing offers a basis for real-time monitoring and more precise process control. Sharmilan et al., for example, developed an artificial olfactory system for tea fermentation using an array of MOS gas sensors to capture odor characteristics at different stages; when combined with machine-learning algorithms, the system achieved nearly 100% accuracy in stage classification [64]. Although this study was not limited to dark tea, it illustrates the feasibility of continuously tracking odor evolution to infer fermentation degree and optimize process endpoints. As sensor stability improves and fermentation datasets accumulate, intelligent online monitoring of dark-tea fermentation and aging should become increasingly practical.

Overall, multimodal sensing can convert traditionally experience-based evaluation of dark-tea fermentation and post-fermentation aging into quantitative, data-driven monitoring. This transition is important for standardizing product quality and improving process control in large-scale production.

3.4. Matcha

Matcha is produced by steaming and drying tender tea leaves to obtain tencha, which is then milled into a fine powder. Its quality is determined not only by sensory traits such as bright green color, fresh aroma, and umami taste, but also by the content and spatial uniformity of key components, including chlorophyll and amino acids [65,66].

3.4.1. Visualized Analysis of Color and Composition

Hyperspectral imaging (HSI) is particularly attractive for matcha because it simultaneously acquires spectral and spatial information, enabling visualization of the distribution of quality-related variables. Ouyang et al. used hyperspectral microscopic imaging (HMI) in the 400–998 nm range to predict color and physicochemical indices in matcha samples [23]. After feature selection by competitive adaptive reweighted sampling (CARS) and random forest, combined with PLSR modeling, they achieved accurate prediction of total chlorophyll content and chromaticity, with a best prediction correlation of Rp = 0.8093. Pixel-wise application of the optimized model generated full-field maps of chlorophyll and other indicators, revealing spatial heterogeneity in color and composition and thereby providing information relevant to grinding efficiency and product consistency. VNIR hyperspectral approaches have also been used for simultaneous quantification of caffeine, tea polyphenols, and free amino acids using standard normal variate (SNV) preprocessing, CARS/BOSS feature selection, and PLS modeling [67]. These studies demonstrate the promise of hyperspectral methods for objective matcha-quality analysis.
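The two steps mentioned above, SNV preprocessing and pixel-wise application of a calibration model, can be sketched as follows. The example assumes that the fitted model reduces to a linear coefficient vector plus an intercept, as is the case for a PLS regression, and it uses a random cube and made-up coefficients in place of real data. The function names and dimensions are hypothetical.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: scale each spectrum by its own mean and std."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def predict_map(cube, coef, intercept):
    """Apply a linear calibration to every pixel of an (H, W, bands) cube."""
    h, w, bands = cube.shape
    pixels = snv(cube.reshape(-1, bands))  # one row per pixel spectrum
    return (pixels @ coef + intercept).reshape(h, w)

# Hypothetical data: a 4 x 6 pixel cube with 50 bands and made-up coefficients.
rng = np.random.default_rng(0)
cube = rng.uniform(0.2, 0.8, size=(4, 6, 50))
coef = rng.normal(scale=0.1, size=50)
chlorophyll_map = predict_map(cube, coef, intercept=1.5)
```

When wavelength selection such as CARS has been applied, only the selected bands would be passed to the model, and the same preprocessing used during calibration must be applied to each pixel spectrum.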

3.4.2. Volatile Aroma and Grade Identification

Traditional assessment of matcha aroma lacks rapid and quantitative tools [68]. To address this gap, researchers have developed odor-sensing systems for grade discrimination. Ouyang et al. designed a ZIF-8-assisted nano colorimetric sensor array to capture aroma fingerprints of matcha and combined it with density functional theory (DFT) to interpret the sensing mechanism [69]. By incorporating pH indicators and metalloporphyrins within ZIF-8, the array achieved stronger selective adsorption of VOCs. Compared with the unmodified array, the ZIF-8-enhanced platform increased color-response intensity by 1.13–4.75 times and improved classification performance; among the tested algorithms, a BP-ANN achieved the best prediction accuracy (95%), representing a 7.5% improvement over the conventional array. DFT analysis supported the high affinity of the ZIF-8 porous structure for key aroma compounds. In another study, Zhang et al. developed a COF@MOF-based colorimetric sensor combined with AI-assisted analysis to visualize VOC changes during matcha drying, with 95.74% accuracy for the identification of seven drying stages under 20–90% relative humidity [70]. These results highlight the value of nanomaterial-enhanced sensing arrays and intelligent algorithms for rapid aroma evaluation of matcha.

3.4.3. Quality Monitoring During Processing

The drying stage of tencha strongly influences the color and nutritional quality of the final matcha product. Whereas drying has traditionally been judged visually, multimodal sensing now enables real-time monitoring of key indicators. One study combined visible–near-infrared (Vis-NIR) spectroscopy with a colorimetric sensor array to classify matcha grades based on VOC-induced color responses [71]. This non-invasive strategy supports continuous monitoring of moisture and aroma-related compounds during processing. Portable NIR systems have also been used to monitor carotenoid content during drying; after variable selection and PLS modeling, one study achieved Rp = 0.9592, confirming the feasibility of tracking pigment precursors associated with color and aroma [35]. Likewise, Vis-NIR spectroscopy fused with colorimetric-sensor-array data has been used to evaluate aroma quality during tencha drying, achieving classification accuracies of 94.68% and 93.48% in the training and prediction sets, respectively, while also identifying key VOCs such as pentanal [72]. These studies demonstrate the potential of multimodal sensing for real-time regulation of moisture, pigments, and aroma during matcha processing.

From raw-material drying to finished-product grading and spatial visualization of powder quality, multimodal sensing has been applied across multiple stages of matcha quality control. These studies provide an increasingly objective basis for matcha evaluation and lay the groundwork for more intelligent production. In addition, physicochemical indicators in matcha show strong associations with sensory quality, suggesting that integrated models based on selected key variables can support effective sensory evaluation [26,73]. Continued advances in sensor design and machine-learning algorithms should further improve the precision and practicality of matcha-quality monitoring.

3.5. Jasmine Tea

Jasmine tea is a reprocessed scented tea produced by repeatedly scenting a green- or black-tea base with fresh jasmine flowers. Its quality is judged primarily by aroma intensity and aroma purity. Traditional evaluation therefore depends heavily on expert olfactory assessment. Multimodal sensing offers a promising route toward more objective evaluation of jasmine-tea aroma quality.

3.5.1. Aroma Intensity and Purity Detection

Electronic noses have been widely used for rapid classification of jasmine-tea aroma profiles. Wang et al. combined e-nose measurements with automated thermal desorption GC–MS to compare volatile components among jasmine teas of different grades and achieved successful classification with PLS-DA [2]. The e-nose effectively distinguished differences in overall aroma intensity associated with the number of scenting rounds, such as once-, three-times-, and six-times-scented teas. However, its sensitivity to trace compounds responsible for the delicate aroma nuances of premium jasmine tea was limited [2]. This suggests that e-nose systems are well suited to rapid screening of aroma strength, but less effective for fine discrimination of aroma type unless they are combined with more specific analytical tools such as GC–MS.

To overcome the limits of single-sensor systems, recent studies have explored multimodal strategies for jasmine-tea aroma evaluation. Gas chromatography-ion mobility spectrometry (GC–IMS), for example, can rapidly generate aroma fingerprint maps. In one study, jasmine teas subjected to one to six scenting rounds were clearly separated on the basis of their volatile fingerprints, especially after PCA [74]. GC–MS coupled with gas chromatography-olfactometry (GC–O) has further helped identify characteristic aroma compounds associated with different scenting methods. For example, single-petal jasmine-scented teas contained higher levels of α-farnesene, jasminelactone, and indole, which contributed to richer and sweeter notes, whereas double-petal jasmine teas contained more methyl benzoate, associated with a fresher floral aroma [75,76]. Such findings show that multimodal chemical profiling can provide a more detailed basis for quantitative aroma evaluation.

3.5.2. Intelligent Control of the Scenting Process

The floral character of jasmine tea is established through repeated scenting cycles in which fresh jasmine flowers are layered with the tea base. Decisions on flower dosage and scenting duration are still largely guided by artisanal experience. Multimodal sensing offers new opportunities to monitor aroma adsorption dynamically and optimize this process. One proposed strategy is to use an electronic nose after each scenting round to capture aroma-intensity profiles and feed them into a regression model that estimates the deviation from an optimal aroma target, thereby guiding whether further scenting is required [1]. Supporting this idea, An et al. used GC–MS to compare jasmine teas produced with different numbers of scenting rounds and found that characteristic compounds such as phenylethanol, jasmone, and indole increased progressively before approaching a plateau after the fifth or sixth round [77]. These compounds could therefore serve as practical indicators for rapid process monitoring. With larger datasets and more robust prediction models, intelligent control of jasmine scenting should become feasible, helping avoid both insufficient aroma uptake and over-scenting.

3.5.3. Evaluation of Appearance Quality

Besides aroma, the quality of jasmine tea also involves appearance traits such as uniformity and the presence of flower debris. Machine vision could be used to evaluate whether the dried product retains a desirable glossy green appearance and whether residual petals are excessive [12]. Although this area remains underexplored, methods developed for green and black tea image analysis can be adapted readily. Through image segmentation and extraction of color and morphological features, the visual quality of jasmine tea could be assessed more objectively against grading standards.

Table 2 summarizes the main sensing modalities, target indices, and modeling methods used for quality evaluation across green tea, black tea, dark tea, matcha, and jasmine tea. Overall, multimodal studies on jasmine tea remain limited but show clear potential. Combining rapid odor sensing with deep-learning classification may enable automated evaluation of aroma intensity and purity and support more standardized grading of scented teas.

3.6. Safety-Related Extensions: Rapid Screening of Pesticide Residues

Pesticides are widely used in tea cultivation, and excessive residues represent an important food-safety concern [78,79]. Conventional chromatographic and mass-spectrometric methods remain the reference approaches because of their sensitivity and reliability, but they are time-consuming and less suitable for high-throughput or on-site screening. Therefore, rapid sensing methods combined with chemometric or deep-learning algorithms have attracted increasing attention as complementary tools for preliminary tea-residue screening (Table 3).

Spectroscopic and spectral-imaging methods currently represent the main route for rapid pesticide-residue detection in tea. Fluorescence hyperspectral imaging combined with feature selection and 1D-CNN/random-forest modeling has been used to identify multiple pesticide residues on tea-leaf surfaces, achieving a test-set classification accuracy of 99.05% [28]. Handheld Raman or SERS platforms coupled with deep-learning models have also shown potential for on-site pesticide identification in tea matrices [80]. In addition, the fusion of NIR and SERS spectra can combine the broad compositional information provided by NIR with the high molecular specificity of SERS, thereby improving quantitative detection performance in complex samples [81].

Nanomaterial-assisted sensing further improves the sensitivity of residue detection. SERS substrates based on Au@Ag nanostructures, Au-Ag OHCs, or aptamer-assisted gold nanoparticles have been reported for the qualitative and quantitative detection of thiram, pymetrozine, imidacloprid, 2,4-D, chlorpyrifos, acetamiprid, and other pesticide residues in tea or matcha samples [82,83,84,85]. Upconversion-fluorescence and FRET-based platforms have also enabled sensitive detection of heavy-metal ions and organophosphorus pesticides such as diazinon and malathion [86,87,88]. However, pesticide detection involves highly diverse targets, complex matrix effects, and different regulatory requirements. In this review, residue detection is therefore treated as a safety-related extension rather than the central focus. Future work should emphasize standardized residue databases, miniaturized sensing devices, robust external validation, and integration with laboratory confirmatory methods.

4. Actionable Methodological Roadmap for Multimodal Tea Quality Evaluation

Although multimodal sensing and deep learning provide a promising framework for tea quality evaluation, future studies should move beyond proof-of-concept model construction and adopt more standardized, reproducible, and deployment-oriented methodological pipelines. A practical roadmap should address not only sensor selection and model construction, but also dataset curation, leakage-free validation, uncertainty assessment, model interpretation, and independent external testing.

4.1. Dataset Curation, Harmonization, and Leakage-Free Validation

The first requirement is to establish standardized and reusable multimodal tea datasets. Each tea sample should be accompanied by complete metadata, including tea type, cultivar, geographic origin, harvest year, season, plucking standard, processing batch, storage condition, and grade definition. Sensor-related metadata should also be recorded, including instrument type, spectral range, spatial or spectral resolution, illumination condition, sample presentation mode, calibration status, and acquisition protocol. For multimodal studies, all data modalities, such as RGB images, spectral curves, hyperspectral cubes, electronic-nose responses, electronic-tongue signals, and reference physicochemical measurements, should be linked through a unified sample identifier. Dataset splitting should be performed at the sample, batch, origin, or harvest-year level rather than at the pixel, spectral-replicate, or repeated-measurement level, because inappropriate splitting can lead to data leakage and overestimated model performance.
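A minimal sketch of group-level splitting is given below. It keeps all records that share a grouping value, here a processing batch, on the same side of the split, so that replicate scans of one batch cannot appear in both training and test data. The record fields, the `group_split` function, and the 25% test fraction are hypothetical illustrations, not a prescribed protocol.

```python
import random
from collections import defaultdict

def group_split(records, group_key, test_fraction=0.25, seed=0):
    """Split records so that each group falls entirely in train or in test."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys, train_keys = keys[:n_test], keys[n_test:]

    train = [rec for k in train_keys for rec in groups[k]]
    test = [rec for k in test_keys for rec in groups[k]]
    return train, test

# Hypothetical metadata: 8 processing batches, 3 replicate scans each
records = [
    {"sample_id": f"S{b:02d}", "batch": f"B{b}", "replicate": r}
    for b in range(8)
    for r in range(3)
]
train, test = group_split(records, group_key="batch")
```

Passing `group_key="sample_id"`, or an origin or harvest-year field, gives the other split levels named above. The stricter the grouping, the closer the estimate comes to performance on genuinely new material.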

For internal validation, nested cross-validation is recommended whenever preprocessing optimization, feature selection, or hyperparameter tuning is involved. In this design, the outer loop is used to estimate model generalization, whereas the inner loop is used for variable selection, model selection, and hyperparameter optimization. This strategy can reduce the bias caused by using the same data for both model selection and performance estimation [89,90]. Bootstrapping can further be used to quantify uncertainty and generate confidence intervals for performance indicators such as accuracy, R², RMSE, AUC, sensitivity, specificity, or classification error [91]. In addition, Y-randomization should be performed as a negative-control test by randomly permuting the response labels and rebuilding the model using the same modeling workflow. If the randomized models still show high performance, the original model may reflect chance correlation, data leakage, or overfitting rather than a meaningful relationship between sensor signals and tea quality attributes [92]. Therefore, future multimodal tea studies should report not only single-split performance values, but also repeated validation results, uncertainty intervals, and negative-control tests.
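The three checks described in this paragraph can be combined in a short workflow, sketched below under simplifying assumptions. Ridge regression stands in for whatever calibration model a study uses, the data are synthetic, and the bootstrap resamples pooled out-of-fold predictions, which is a simplification of resampling the whole workflow. All function names and settings are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def predict(model, X):
    w, b = model
    return X @ w + b

def r2(y, pred):
    return float(1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2))

def k_folds(n, k, rng):
    return np.array_split(rng.permutation(n), k)

def nested_cv_predict(X, y, lambdas, outer_k=5, inner_k=4, seed=0):
    """Out-of-fold predictions; lambda is tuned inside each outer training set."""
    rng = np.random.default_rng(seed)
    preds = np.empty(len(y))
    for test in k_folds(len(y), outer_k, rng):
        train = np.setdiff1d(np.arange(len(y)), test)
        Xtr, ytr = X[train], y[train]
        inner = k_folds(len(ytr), inner_k, rng)

        def inner_rmse(lam):
            sq = []
            for val in inner:
                fit = np.setdiff1d(np.arange(len(ytr)), val)
                model = ridge_fit(Xtr[fit], ytr[fit], lam)
                sq.append(np.mean((predict(model, Xtr[val]) - ytr[val]) ** 2))
            return np.sqrt(np.mean(sq))

        best_lam = min(lambdas, key=inner_rmse)  # selected without touching `test`
        preds[test] = predict(ridge_fit(Xtr, ytr, best_lam), X[test])
    return preds

def bootstrap_ci(y, pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for R2 from resampled (y, pred) pairs."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        stats.append(r2(y[idx], pred[idx]))
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

# Synthetic stand-in for sensor features and a reference quality value
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 20))
y = X[:, :3] @ np.array([1.0, -0.5, 0.8]) + 0.1 * rng.normal(size=120)
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]

pred = nested_cv_predict(X, y, lambdas)
r2_real = r2(y, pred)
ci_low, ci_high = bootstrap_ci(y, pred)

# Y-randomization: same workflow, permuted response
y_perm = rng.permutation(y)
r2_perm = r2(y_perm, nested_cv_predict(X, y_perm, lambdas))
```

A real relationship should give a high cross-validated R² with a narrow interval, while the permuted response should give a value near or below zero. In practice the permutation would be repeated many times to build a null distribution, and the folds would be formed at the sample or batch level.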

4.2. Model Interpretability, External Validation, and Industrial Readiness

Model interpretability should be incorporated into multimodal tea-quality modeling from the beginning rather than treated as a post hoc supplement. For spectral models, global feature-importance metrics, permutation importance, SHAP values, VIP scores, or attention weights can be used to identify key wavelengths associated with tea polyphenols, caffeine, amino acids, pigments, moisture, or other quality-related constituents [93,94]. For machine-vision and hyperspectral-imaging models, saliency maps, Grad-CAM, attention maps, or pixel-wise contribution maps can help determine whether predictions are driven by meaningful regions, such as leaf surface, liquor color, powder distribution, or local defects, rather than by background or illumination artifacts [95]. For electronic-nose and electronic-tongue systems, sensor-channel importance analysis can reveal which gas sensors or electrochemical electrodes contribute most to aroma or taste discrimination. Local explanation methods such as LIME may also be useful for interpreting individual predictions and identifying abnormal samples or modality-specific failures [96]. In multimodal models, both modality-level and feature-level contributions should be evaluated to determine whether the model truly benefits from complementary information or is dominated by a single high-dimensional modality.
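Modality-level contribution can be estimated with a block-wise variant of permutation importance, sketched below. All columns belonging to one modality are shuffled together on held-out data, and the resulting increase in RMSE is recorded. The linear model, the synthetic "nir" and "enose" feature blocks, and the function names are hypothetical stand-ins chosen so that only one modality carries signal.

```python
import numpy as np

def rmse(y, pred):
    return float(np.sqrt(np.mean((y - pred) ** 2)))

def modality_importance(predict, X, y, blocks, n_repeats=20, seed=0):
    """Mean RMSE increase when one modality's columns are shuffled across samples."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    scores = {}
    for name, cols in blocks.items():
        deltas = []
        for _ in range(n_repeats):
            Xp = X.copy()
            # Shuffle the whole block jointly so within-modality structure is kept
            Xp[:, cols] = X[rng.permutation(len(y))][:, cols]
            deltas.append(rmse(y, predict(Xp)) - baseline)
        scores[name] = float(np.mean(deltas))
    return scores

# Synthetic example: the response depends on the "nir" block only
rng = np.random.default_rng(0)
nir = rng.normal(size=(200, 5))
enose = rng.normal(size=(200, 3))
X = np.hstack([nir, enose])
y = nir @ np.array([1.0, 0.5, -0.8, 0.3, 0.6]) + 0.1 * rng.normal(size=200)

train, test = np.arange(150), np.arange(150, 200)
design = np.column_stack([X[train], np.ones(len(train))])
coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)

def linear_predict(M):
    return np.column_stack([M, np.ones(len(M))]) @ coef

scores = modality_importance(
    linear_predict, X[test], y[test],
    blocks={"nir": slice(0, 5), "enose": slice(5, 8)},
)
```

A modality whose score stays near zero adds little to the fused model on the evaluated data. Strongly correlated modalities can mask each other, so such scores are best read alongside single-modality baselines.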

External validation should be based on truly independent samples that are excluded from all modeling steps, including preprocessing, normalization, feature selection, model training, and hyperparameter tuning. Preprocessing parameters, selected wavelengths, feature subsets, scaling coefficients, and model parameters should be fitted only on the training set and then applied unchanged to the independent test set. To realistically assess generalizability, external validation should be stratified by tea type, cultivar, geographic origin, harvest year, processing batch, storage condition, and instrument platform whenever possible. Recommended validation schemes include leave-one-origin-out, leave-one-year-out, leave-one-batch-out, and leave-one-instrument-out testing. Before industrial deployment, multimodal systems should also be evaluated under realistic operating conditions, including variations in illumination, sample stacking, humidity, sensor drift, instrument aging, operator handling, and online processing speed. Only models that remain interpretable, calibrated, transferable, and robust under these independent validation conditions can be considered ready for routine industrial tea-quality control.
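The requirement that preprocessing be fitted on training data only can be illustrated with a leave-one-origin-out loop, sketched below. Scaling statistics are computed from the training origins and reused unchanged on the held-out origin. The origin labels, the synthetic data, and the plain least-squares model are hypothetical placeholders for a study's own metadata and calibration method.

```python
import numpy as np

def fit_scaler(X):
    """Column means and standard deviations; zero std is replaced by 1."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    return mean, np.where(std == 0, 1.0, std)

def leave_one_group_out(groups):
    """Yield (held-out group, train indices, test indices) for each group."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield g, np.flatnonzero(groups != g), np.flatnonzero(groups == g)

def evaluate_logo(X, y, groups):
    """RMSE on each held-out group, with scaling fitted on training groups only."""
    results = {}
    for g, train, test in leave_one_group_out(groups):
        mean, std = fit_scaler(X[train])
        Xtr = (X[train] - mean) / std
        Xte = (X[test] - mean) / std  # training statistics applied unchanged
        design = np.column_stack([Xtr, np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        pred = np.column_stack([Xte, np.ones(len(test))]) @ coef
        results[str(g)] = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return results

# Synthetic data: 4 hypothetical origins, 15 samples each, 6 features
rng = np.random.default_rng(0)
origins = np.repeat(["origin_A", "origin_B", "origin_C", "origin_D"], 15)
X = rng.normal(size=(60, 6)) + np.repeat(rng.normal(size=(4, 6)), 15, axis=0)
y = X @ np.array([0.8, -0.4, 0.3, 0.0, 0.5, -0.2]) + 0.1 * rng.normal(size=60)

rmse_by_origin = evaluate_logo(X, y, origins)
```

Replacing the origin labels with harvest year, batch, or instrument identifiers gives the other leave-one-group-out schemes listed above. Large differences in error between held-out groups are themselves a useful indicator of limited transferability.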

5. Conclusions and Perspectives

Multimodal sensing combined with deep learning is providing an increasingly powerful framework for non-destructive tea quality evaluation. Compared with conventional sensory assessment and destructive laboratory analysis, these approaches enable more objective, rapid, and information-rich characterization of tea by integrating complementary signals related to appearance, chemical composition, aroma, taste, processing status, and safety-related screening. The main contribution of this review is to clarify the role of sensor complementarity, fusion strategies, and deployable model design in representative tea products. At the same time, this field remains at a relatively early stage of translation from laboratory research to practical application. Major constraints include small and weakly standardized datasets, limited external validation across tea categories and production scenarios, insufficient sensor stability under real operating conditions, underdeveloped multimodal fusion pipelines, and persistent challenges in model interpretability, transferability, and lightweight deployment. In particular, future studies should adopt leakage-free nested validation, uncertainty estimation, Y-randomization, interpretable feature-attribution methods, and independent validation schemes stratified by variety, origin, harvest year, and processing batch.

Author Contributions

X.H., writing—original draft preparation; M.Z., investigation; B.Y., resources; Y.T., resources; W.W., writing—review and editing, conceptualization, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD-2023-87).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT 5.2 to assist with language polishing and schematic figure drafting. All AI-assisted content was critically reviewed, edited, and approved by the authors. The authors take full responsibility for the accuracy, integrity, and originality of the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, C.; Zhou, C.; Tian, C.; Xu, K.; Lai, Z.; Lin, Y.; Guo, Y. Volatilomics Analysis of Jasmine Tea During Multiple Rounds of Scenting Processes. Foods 2023, 12, 812.
2. Wang, S.; Zhao, F.; Wu, W.; Wang, P.; Ye, N. Comparison of Volatiles in Different Jasmine Tea Grade Samples Using Electronic Nose and Automatic Thermal Desorption-Gas Chromatography-Mass Spectrometry Followed by Multivariate Statistical Analysis. Molecules 2020, 25, 380.
3. Wang, Z.; Ahmad, W.; Zhu, A.; Geng, W.; Kang, W.; Ouyang, Q.; Chen, Q. Identification of Volatile Compounds and Metabolic Pathway During Ultrasound-Assisted Kombucha Fermentation by HS-SPME-GC/MS Combined with Metabolomic Analysis. Ultrason. Sonochem. 2023, 94, 106339.
4. Wang, D.; Gao, Q.; Wang, T.; Zhao, G.; Qian, F.; Huang, J.; Wang, H.; Zhang, X.; Wang, Y. Green tea infusion protects against alcoholic liver injury by attenuating inflammation and regulating the PI3K/Akt/eNOS pathway in C57BL/6 mice. Food Funct. 2017, 8, 3165–3177.
5. Mao, H.; Du, X.; Yan, Y.; Zhang, X.; Ma, G.; Wang, Y.; Liu, Y.; Wang, B.; Yang, X.; Shi, Q. Highly Sensitive Detection of Daminozide Using Terahertz Metamaterial Sensors. Int. J. Agric. Biol. Eng. 2022, 15, 180–188.
6. Zhang, Z.; Zhang, Y.; Jayan, H.; Gao, S.; Zhou, R.; Yosri, N.; Zou, X.; Guo, Z. Recent and Emerging Trends of Metal-Organic Frameworks (MOFs)-Based Sensors for Detecting Food Contaminants: A Critical and Comprehensive Review. Food Chem. 2024, 448, 139051.
7. Lin, H.; Xu, P.-T.; Sun, L.; Bi, X.; Zhao, J.; Cai, J. Identification of Eggshell Crack Using Multiple Vibration Sensors and Correlative Information Analysis. J. Food Process Eng. 2018, 41, e12894.
8. Qi, S.; Ouyang, Q.; Chen, Q.; Zhao, J. Real-Time Monitoring of Total Polyphenols Content in Tea Using a Developed Optical Sensors System. J. Pharm. Biomed. Anal. 2014, 97, 116–122.
9. Ouyang, Q.; Liu, Y.; Chen, Q.; Zhang, Z.; Zhao, J.; Guo, Z.; Gu, H. Intelligent Evaluation of Color Sensory Quality of Black Tea by Visible-Near Infrared Spectroscopy Technology: A Comparison of Spectra and Color Data Information. Spectrochim. Acta Part A 2017, 180, 91–96.
10. Hongyang, T.; Daming, H.; Xingyi, H.; Aheto, J.H.; Yi, R.; Yu, W.; Ji, L.; Shuai, N.; Mengqi, X. Detection of Browning of Fresh-Cut Potato Chips Based on Machine Vision and Electronic Nose. J. Food Process Eng. 2021, 44, e13631.
11. Zareef, M.; Chen, Q.; Ouyang, Q.; Arslan, M.; Hassan, M.M.; Ahmad, W.; Viswadevarayalu, A.; Wang, P.; Ancheng, W. Rapid Screening of Phenolic Compounds in Congou Black Tea (Camellia sinensis) During In Vitro Fermentation Process Using Portable Spectral Analytical System Coupled Chemometrics. J. Food Process. Preserv. 2019, 43, e13996.
12. Wang, J.; Zareef, M.; He, P.; Sun, H.; Chen, Q.; Li, H.; Ouyang, Q.; Guo, Z.; Zhang, Z.; Xu, D. Evaluation of Matcha Tea Quality Index Using Portable NIR Spectroscopy Coupled with Chemometric Algorithms. J. Sci. Food Agric. 2019, 99, 5019–5027.
13. Zhao, S.; Adade, S.Y.-S.S.; Wang, Z.; Wu, J.; Jiao, T.; Li, H.; Chen, Q. On-Line Monitoring of Total Sugar During Kombucha Fermentation Process by near-Infrared Spectroscopy: Comparison of Linear and Non-Linear Multiple Calibration Methods. Food Chem. 2023, 423, 136208.
14. Chen, Q.; Zhang, D.; Pan, W.; Ouyang, Q.; Li, H.; Urmila, K.; Zhao, J. Recent Developments of Green Analytical Techniques in Analysis of Tea’s Quality and Nutrition. Trends Food Sci. Technol. 2015, 43, 63–82.
15. Yu, S.; Huang, X.; Wang, L.; Ren, Y.; Zhang, X.; Wang, Y. Characterization of Selected Chinese Soybean Paste Based on Flavor Profiles Using HS-SPME-GC/MS, E-Nose and E-Tongue Combined with Chemometrics. Food Chem. 2022, 375, 131840.
16. Zhou, X.; Sun, J.; Tian, Y.; Lu, B.; Hang, Y.; Chen, Q. Hyperspectral Technique Combined with Deep Learning Algorithm for Detection of Compound Heavy Metals in Lettuce. Food Chem. 2020, 321, 126503.
17. Huang, Y.; Li, Z.; Bian, Z.; Jin, H.; Zheng, G.; Hu, D.; Sun, Y.; Fan, C.; Xie, W.; Fang, H. Overview of Deep Learning and Nondestructive Detection Technology for Quality Assessment of Tomatoes. Foods 2025, 14, 286.
18. Adade, S.Y.-S.S.; Lin, H.; Nunekpeku, X.; Johnson, N.A.N.; Agyekum, A.A.; Zhao, S.; Teye, E.; Qianqian, S.; Kwadzokpui, B.A.; Ekumah, J.-N.; et al. Flexible Paper-Based AuNP Sensor for Rapid Detection of Diabenz (a,h)Anthracene (DbA) and Benzo(b)Fluoranthene (BbF) in Mussels Coupled with Deep Learning Algorithms. Food Control 2025, 168, 110966.
19. Zhi, S.; An, T.; Zhang, H.; Bai, Y.; Zhang, B.; Tian, G. Recent Advances and Applications of Imaging and Spectroscopy Technologies for Tea Quality Assessment: A Review. Agronomy 2025, 15, 1507.
20. Chang, H.; Cai, J.; Ouyang, Q. Intelligent Chlorophyll Estimation by Attention-Integrated Deep Learning and Dual-Modal Fusion in Tencha Drying Using Snapshot Multispectral Camera. J. Sci. Food Agric. 2025, 105, 6737–6745.
21. Wu, T.; Zhou, L.; Zhao, Y.; Qi, H.; Pu, Y.; Zhang, C.; Liu, Y. Applications of Deep Learning in Tea Quality Monitoring: A Review. Artif. Intell. Rev. 2025, 58, 342.
22. You, J.; Li, D.; Wang, Z.; Chen, Q.; Ouyang, Q. Prediction and Visualization of Moisture Content in Tencha Drying Processes by Computer Vision and Deep Learning. J. Sci. Food Agric. 2024, 104, 5486–5494.
23. Ramola, A.; Shakya, A.K.; Van Pham, D. Study of statistical methods for texture analysis and their modern evolutions. Eng. Rep. 2020, 2, e12149.
24. Tang, Z.; Su, Y.; Er, M.J.; Qi, F.; Zhang, L.; Zhou, J. A local binary pattern based texture descriptors for classification of tea leaves. Neurocomputing 2015, 168, 1011–1023.
25. Li, D.; Park, B.; Kang, R.; Chen, Q.; Ouyang, Q. Quantitative Prediction and Visualization of Matcha Color Physicochemical Indicators Using Hyperspectral Microscope Imaging Technology. Food Control 2024, 163, 110531.
26. Rong, Y.; Riaz, T.; Lin, H.; Wang, Z.; Chen, Q.; Ouyang, Q. Application of Visible Near-Infrared Spectroscopy Combined with Colorimetric Sensor Array for the Aroma Quality Evaluation in Tencha Drying Process. Spectrochim. Acta Part A 2024, 304, 123385.
27. Xu, Y.; Hassan, M.M.; Ali, S.; Li, H.; Ouyang, Q.; Chen, Q. Self-Cleaning-Mediated SERS Chip Coupled Chemometric Algorithms for Detection and Photocatalytic Degradation of Pesticides in Food. J. Agric. Food Chem. 2021, 69, 1667–1674.
28. Sun, J.; Hu, Y.; Zou, Y.; Geng, J.; Wu, Y.; Fan, R.; Kang, Z. Identification of Pesticide Residues on Black Tea by Fluorescence Hyperspectral Technology Combined with Machine Learning. Food Sci. Technol. 2022, 42, e55822.
29. Zhao, J.; Wang, K.; Ouyang, Q.; Chen, Q. Measurement of Chlorophyll Content and Distribution in Tea Plant’s Leaf Using Hyperspectral Imaging Technique. Spectrosc. Spectr. Anal. 2011, 31, 512–515.
30. Han, Z.; Ahmad, W.; Rong, Y.; Chen, X.; Zhao, S.; Yu, J.; Zheng, P.; Huang, C.; Li, H. A Gas Sensors Detection System for Real-Time Monitoring of Changes in Volatile Organic Compounds During Oolong Tea Processing. Foods 2024, 13, 1721.
31. Chen, Q.; Sun, C.; Ouyang, Q.; Wang, Y.; Liu, A.; Li, H.; Zhao, J. Classification of Different Varieties of Oolong Tea Using Novel Artificial Sensing Tools and Data Fusion. LWT-Food Sci. Technol. 2015, 60, 781–787.
32. Chen, C.; Zhu, W.; Steibel, J.; Siegford, J.; Han, J.; Norton, T. Classification of Drinking and Drinker-Playing in Pigs by a Video-Based Deep Learning Method. Biosyst. Eng. 2020, 196, 1–14.
33. Zhou, X.; Zhao, C.; Sun, J.; Cao, Y.; Yao, K.; Xu, M. A Deep Learning Method for Predicting Lead Content in Oilseed Rape Leaves Using Fluorescence Hyperspectral Imaging. Food Chem. 2023, 409, 135251.
34. Liu, J.; Abbas, I.; Noor, R.S. Development of Deep Learning-Based Variable Rate Agrochemical Spraying System for Targeted Weeds Control in Strawberry Crop. Agronomy 2021, 11, 1480.
35. Li, L.; Xie, S.; Zhu, F.; Ning, J.; Chen, Q.; Zhang, Z. Colorimetric sensor array-based artificial olfactory system for sensing Chinese green tea’s quality: A method of fabrication. Int. J. Food Prop. 2017, 20, 1762–1773.
36. Chen, Q.; Guo, Z.; Zhao, J.; Ouyang, Q. Comparisons of Different Regressions Tools in Measurement of Antioxidant Activity in Green Tea Using Near Infrared Spectroscopy. J. Pharm. Biomed. Anal. 2012, 60, 92–97.
37. Tseng, T.-S.; Hsiao, M.-H.; Chen, P.-A.; Lin, S.-Y.; Chiu, S.-W.; Yao, D.-J. Utilization of a Gas-Sensing System to Discriminate Smell and to Monitor Fermentation During the Manufacture of Oolong Tea Leaves. Micromachines 2021, 12, 93.
38. Liu, M.; Jiang, C.; Hassan, M.M.; Zhang, X.; Wang, R.; Cao, R.; Sheng, W.; Li, H. Investigation of Microbial Fermentation Degree of Pu-Erh Tea Using Deep Learning Coupled Colorimetric Sensor Array via Prediction of Total Polyphenols. Chemosensors 2024, 12, 265.
39. Otto, A. Excitation of nonradiative surface plasma waves in silver by the method of frustrated total reflection. Z. Phys. A Hadron. Nucl. 1968, 216, 398–410.
40. Liu, C.; Su, W.; Liu, Q.; Lu, X.; Wang, F.; Sun, T.; Chu, P.K. Symmetrical dual D-shape photonic crystal fibers for surface plasmon resonance sensing. Opt. Express 2018, 26, 9039–9049.
41. Kimutai, G.; Ngenzi, A.; Rutabayiro Ngoga, S.; Ramkat, R.C.; Förster, A. An internet of things (IoT)-based optimum tea fermentation detection model using convolutional neural networks (CNNs) and majority voting techniques. J. Sens. Sens. Syst. 2021, 10, 153–162.
42. Lan, T.; Shen, S.; Yuan, H.; Jiang, Y.; Tong, H.; Ye, Y. A Rapid Prediction Method of Moisture Content for Green Tea Fixation Based on WOA-Elman. Foods 2022, 11, 2928.
43. Li, H.; Hu, Y.; Ma, S.; Haruna, S.A.; Chen, Q.; Zhu, W.; Xia, A. Porphyrin and pH Sensitive Dye-Based Colorimetric Sensor Array Coupled Chemometrics for Dynamic Monitoring of Tea Quality During Ultrasound-Assisted Fermentation. Microchem. J. 2024, 197, 109813.
44. Jiang, Y.; Zareef, M.; Liu, L.; Ouyang, Q. Monitoring of Carotenoids Changes During the Matcha Drying Process Using a Portable Developed Spectral Analytical System. J. Food Compos. Anal. 2024, 125, 105849.
45. Gharibzahedi, S.M.T.; Barba, F.J.; Zhou, J.; Wang, M.; Altintas, Z. Electronic Sensor Technologies in Monitoring Quality of Tea: A Review. Biosensors 2022, 12, 356.
46. Zhao, J.; Chen, Q.; Cai, J.; Ouyang, Q. Automated Tea Quality Classification by Hyperspectral Imaging. Appl. Opt. 2009, 48, 3557–3564.
47. Li, Y.; Sun, J.; Wu, X.; Lu, B.; Wu, M.; Dai, C. Grade Identification of Tieguanyin Tea Using Fluorescence Hyperspectra and Different Statistical Algorithms. J. Food Sci. 2019, 84, 2234–2241.
48. He, F.; Wu, X.; Wu, B.; Zeng, S.; Zhu, X. Green Tea Grades Identification via Fourier Transform Near-Infrared Spectroscopy and Weighted Global Fuzzy Uncorrelated Discriminant Transform. J. Food Process Eng. 2022, 45, e14109.
49. Li, H.; Wu, P.; Dai, J.; Pan, T.; Holmes, M.; Chen, T.; Zou, X. Discriminating Compounds Identification Based on the Innovative Sparse Representation Chemometrics to Assess the Quality of Maofeng Tea. J. Food Compos. Anal. 2023, 123, 105590.
50. Zhao, S.; Adade, S.Y.-S.S.; Wang, Z.; Jiao, T.; Ouyang, Q.; Li, H.; Chen, Q. Deep Learning and Feature Reconstruction Assisted Vis-NIR Calibration Method for on-Line Monitoring of Key Growth Indicators During Kombucha Production. Food Chem. 2025, 463, 141411.
51. Jiang, Y.; Hua, J.; Wang, B.; Yuan, H.; Ma, H. Effects of Variety, Season, and Region on Theaflavins Content of Fermented Chinese Congou Black Tea. J. Food Qual. 2018, 2018, 5427302.
52. Guo, Z.; Barimah, A.O.; Yin, L.; Chen, Q.; Shi, J.; El-Seedi, H.R.; Zou, X. Intelligent Evaluation of Taste Constituents and Polyphenols-to-Amino Acids Ratio in Matcha Tea Powder Using Near Infrared Spectroscopy. Food Chem. 2021, 353, 129372.
53. Chai, Z.; Tian, L.; Yu, H.; Zhang, L.; Zeng, Q.; Wu, H.; Yan, Z.; Li, D.; Hutabarat, R.P.; Huang, W. Comparison on Chemical Compositions and Antioxidant Capacities of the Green, Oolong, and Red Tea from Blueberry Leaves. Food Sci. Nutr. 2020, 8, 1688–1699.
54. Jiang, H.; Xu, W.; Chen, Q. Determination of Tea Polyphenols in Green Tea by Homemade Color Sensitive Sensor Combined with Multivariate Analysis. Food Chem. 2020, 319, 126584.
55. Zhou, H.; Fu, H.; Wu, X.; Wu, B.; Dai, C. Discrimination of Tea Varieties Based on FTIR Spectroscopy and an Adaptive Improved Possibilistic C-Means Clustering. J. Food Process. Preserv. 2020, 44, e14795.
56. Chen, Q.; Chen, M.; Liu, Y.; Wu, J.; Wang, X.; Ouyang, Q.; Chen, X. Application of FT-NIR Spectroscopy for Simultaneous Estimation of Taste Quality and Taste-Related Compounds Content of Black Tea. J. Food Sci. Technol. 2018, 55, 4363–4368.
57. Zareef, M.; Hassan, M.M.; Arslan, M.; Ahmad, W.; Ali, S.; Ouyang, Q.; Li, H.; Wu, X.; Chen, Q. Rapid Prediction of Caffeine in Tea Based on Surface-Enhanced Raman Spectroscopy Coupled Multivariate Calibration. Microchem. J. 2020, 159, 105431.
58. Ouyang, Q.; Yang, Y.; Wu, J.; Chen, Q.; Guo, Z.; Li, H. Measurement of Total Free Amino Acids Content in Black Tea Using Electronic Tongue Technology Coupled with Chemometrics. LWT 2020, 118, 108768.
59. Sharmilan, T.; Premarathne, I.; Wanniarachchi, I.; Kumari, S.; Wanniarachchi, D. Application of Electronic Nose to Predict the Optimum Fermentation Time for Low-country Sri Lankan Tea. J. Food Qual. 2022, 2022, 7703352.
60. Li, H.; Zhang, B.; Hu, W.; Liu, Y.; Dong, C.; Chen, Q. Monitoring Black Tea Fermentation Using a Colorimetric Sensor Array-Based Artificial Olfaction System. J. Food Process. Preserv. 2018, 42, e13348.
61. Lv, H.; Zhang, Y.; Lin, Z.; Liang, Y. Processing and Chemical Constituents of Pu-Erh Tea: A Review. Food Res. Int. 2013, 53, 608–618.
62. Boateng, I.D.; Li, F.; Yang, X.-M.; Guo, D. Combinative Effect of Pulsed-Light Irradiation and Solid-State Fermentation on Ginkgolic Acids, Ginkgols, Ginkgolides, Bilobalide, Flavonoids, Product Quality and Sensory Assessment of Ginkgo biloba Dark Tea. Food Chem. 2024, 456, 139979.
63. Liu, Z.; Xie, H.; Chen, L.; Huang, J. An Improved Weighted Partial Least Squares Method Coupled with Near Infrared Spectroscopy for Rapid Determination of Multiple Components and Anti-Oxidant Activity of Pu-Erh Tea. Molecules 2018, 23, 1058.
64. Sharmilan, T.; Premarathne, I.; Wanniarachchi, I.; Kumari, S.; Wanniarachchi, D. Electronic Nose Technologies in Monitoring Black Tea Manufacturing Process. J. Sens. 2020, 11, 3073104.
65. Liu, L.; Zareef, M.; Wang, Z.; Li, H.; Chen, Q.; Ouyang, Q. Monitoring Chlorophyll Changes During Tencha Processing Using Portable Near-Infrared Spectroscopy. Food Chem. 2023, 412, 135505.
66. Ouyang, Q.; Wang, L.; Park, B.; Kang, R.; Wang, Z.; Chen, Q.; Guo, Z. Assessment of Matcha Sensory Quality Using Hyperspectral Microscope Imaging Technology. LWT 2020, 125, 109254.
67. Ouyang, Q.; Wang, L.; Park, B.; Kang, R.; Chen, Q. Simultaneous Quantification of Chemical Constituents in Matcha with Visible-Near Infrared Hyperspectral Imaging Technology. Food Chem. 2021, 350, 129141.
68. Liu, S.; Rong, Y.; Chen, Q.; Ouyang, Q. Colorimetric Sensor Array Combined with Chemometric Methods for the Assessment of Aroma Produced During the Drying of Tencha. Food Chem. 2024, 432, 137190.
69. Wang, Y.; Shoaib, M.; Wang, J.; Lin, H.; Chen, Q.; Ouyang, Q. A Novel ZIF-8 Mediated Nanocomposite Colorimetric Sensor Array for Rapid Identification of Matcha Grades, Validated by Density Functional Theory. J. Food Compos. Anal. 2025, 137, 106864.
70. Ouyang, Q.; Rong, Y.; Xia, G.; Chen, Q.; Ma, Y.; Liu, Z. Integrating Humidity-Resistant and Colorimetric COF-on-MOF Sensors with Artificial Intelligence Assisted Data Analysis for Visualization of Volatile Organic Compounds Sensing. Adv. Sci. 2025, 12, e2411621.
71. Ouyang, Q.; Rong, Y.; Wu, J.; Wang, Z.; Lin, H.; Chen, Q. Application of Colorimetric Sensor Array Combined with Visible Near-Infrared Spectroscopy for the Matcha Classification. Food Chem. 2023, 420, 136078.
72. Ouyang, Q.; Yang, Y.; Wu, J.; Liu, Z.; Chen, X.; Dong, C.; Chen, Q.; Zhang, Z.; Guo, Z. Rapid Sensing of Total Theaflavins Content in Black Tea Using a Portable Electronic Tongue System Coupled to Efficient Variables Selection Algorithms. J. Food Compos. Anal. 2019, 75, 43–48.
73. Wu, J.; Zareef, M.; Chen, Q.; Ouyang, Q. Application of Visible-Near Infrared Spectroscopy in Tandem with Multivariate Analysis for the Rapid Evaluation of Matcha Physicochemical Indicators. Food Chem. 2023, 421, 136185.
74. Hou, Z.; Chen, Z.; Li, L.; Chen, H.; Zhang, H.; Liu, S.; Zhang, R.; Song, Q.; Chen, Y.; Su, Z.; et al. Comparison of Volatile Compounds in Jingshan Green Tea Scented with Different Flowers Using GC–IMS and GC–MS Analyses. Foods 2024, 13, 2653.
75. Gu, M.; Zhang, Y.; Weng, Q.; Weng, W.; Ren, W.; Jin, S.; Lin, H.; Wang, P.; She, W.; Ye, N. Metabolomics Analysis Reveals Dynamic Changes of Volatile and Non-Volatile Metabolites During the Scenting Process of Jasmine Tea. Food Chem. X 2025, 28, 102617.
76. An, H.; Ou, X.; Zhang, Y.; Li, S.; Xiong, Y.; Li, Q.; Huang, J.; Liu, Z. Study on the Key Volatile Compounds and Aroma Quality of Jasmine Tea with Different Scenting Technology. Food Chem. 2022, 385, 132718.
77. Zhang, Y.; Gu, M.; Yang, S.; Fan, W.; Lin, H.; Jin, S.; Wang, P.; Ye, N. Dynamic Aroma Characteristics of Jasmine Tea Scented with Single-Petal Jasmine “Bijian”: A Comparative Study with Traditional Double-Petal Jasmine. Food Chem. 2025, 464, 141735.
78. Ouyang, Q.; Wang, L.; Ahmad, W.; Rong, Y.; Li, H.; Hu, Y.; Chen, Q. A Highly Sensitive Detection of Carbendazim Pesticide in Food Based on the Upconversion-MnO2 Luminescent Resonance Energy Transfer Biosensor. Food Chem. 2021, 349, 129157.
79. Marimuthu, M.; Xu, K.; Song, W.; Chen, Q.; Wen, H. Safeguarding Food Safety: Nanomaterials-Based Fluorescent Sensors for Pesticide Tracing. Food Chem. 2025, 463, 141288.
80. Zhu, J.; Sharma, A.S.; Xu, J.; Xu, Y.; Jiao, T.; Ouyang, Q.; Li, H.; Chen, Q. Rapid On-Site Identification of Pesticide Residues in Tea by One-Dimensional Convolutional Neural Network Coupled with Surface-Enhanced Raman Scattering. Spectrochim. Acta Part A 2021, 246, 118994.
81. Li, H.; Luo, X.; Haruna, S.A.; Zareef, M.; Chen, Q.; Ding, Z.; Yan, Y. Au-Ag OHCs-Based SERS Sensor Coupled with Deep Learning CNN Algorithm to Quantify Thiram and Pymetrozine in Tea. Food Chem. 2023, 428, 136798.
82. Kang, W.; Lin, H.; Adade, S.Y.-S.S.; Wang, Z.; Ouyang, Q.; Chen, Q. Advanced Sensing of Volatile Organic Compounds in the Fermentation of Kombucha Tea Extract Enabled by Nano-Colorimetric Sensor Array Based on Density Functional Theory. Food Chem. 2023, 405, 134193.
83. Hassan, M.M.; Li, H.; Ahmad, W.; Zareef, M.; Wang, J.; Xie, S.; Wang, P.; Ouyang, Q.; Wang, S.; Chen, Q. Au@Ag Nanostructure Based SERS Substrate for Simultaneous Determination of Pesticides Residue in Tea via Solid Phase Extraction Coupled Multivariate Calibration. LWT 2019, 105, 290–297.
84. Zhu, J.; Agyekum, A.A.; Kutsanedzie, F.Y.H.; Li, H.; Chen, Q.; Ouyang, Q.; Jiang, H. Qualitative and Quantitative Analysis of Chlorpyrifos Residues in Tea by Surface-Enhanced Raman Spectroscopy (SERS) Combined with Chemometric Models. LWT 2018, 97, 760–769.
85. Li, H.; Hu, W.; Hassan, M.M.; Zhang, Z.; Chen, Q. A Facile and Sensitive SERS-Based Biosensor for Colormetric Detection of Acetamiprid in Green Tea Based on Unmodified Gold Nanoparticles. J. Food Meas. Charact. 2019, 13, 259–268.
86. Xu, Y.; Kutsanedzie, F.Y.H.; Ali, S.; Wang, P.; Li, C.; Ouyang, Q.; Li, H.; Chen, Q. Cysteamine-Mediated Upconversion Sensor for Lead Ion Detection in Food. J. Food Meas. Charact. 2021, 15, 4849–4857.
87. Li, H.; Ali, S.; Wei, W.; Xu, Y.; Lu, H.; Hassan, M.M.; Wu, X.; Zuo, M.; Ouyang, Q.; Chen, Q. Rapid Detection of Organophosphorus in Tea Using NaY/GdF4:Yb, Er-Based Fluorescence Sensor. Microchem. J. 2020, 159, 105462.
88. Chen, Q.; Sheng, R.; Wang, P.; Ouyang, Q.; Wang, A.; Ali, S.; Zareef, M.; Hassan, M.M. Ultra-Sensitive Detection of Malathion Residues Using FRET-Based Upconversion Fluorescence Sensor in Food. Spectrochim. Acta Part A 2020, 241, 118654.
89. Varma, S.; Simon, R. Bias in error estimation when using cross-validation for model selection. BMC Bioinform. 2006, 7, 91.
90. Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res. 2010, 11, 2079–2107.
91. Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat. 1979, 7, 1–26.
92. Rücker, C.; Rücker, G.; Meringer, M. y-Randomization and its variants in QSPR/QSAR. J. Chem. Inf. Model. 2007, 47, 2345–2357.
93. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
94. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017); NIPS Foundation: San Diego, CA, USA, 2017.
95. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626.
96. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.

Figure 1. Integrated application schema of multimodal sensing and deep learning for tea quality evaluation. Abbreviations: DNN, deep neural network; CNN, convolutional neural network; RNN, recurrent neural network; GNN, graph neural network.

Figure 2. Future roadmap for multimodal intelligent tea quality and safety evaluation.

Table 2. Comparison of multimodal quality evaluation of representative tea types in terms of key indicators, sensing modalities, and modeling methods.

Tea Type	Main Quality Indicators	Sensing Modalities	Modeling Methods	References
Green Tea	Appearance uniformity and color; moisture content; polyphenols; amino acids	Machine vision; near-infrared spectroscopy; colorimetric sensor arrays	PLS regression; Elman neural network; CNN-based prediction	[37,38]
Black Tea	Aroma-compound abundance; theaflavin content; fermentation degree	Electronic nose; visible/near-infrared spectroscopy; machine vision (liquor color)	MOS-based e-nose + PLS-DA; NIR quantification of theaflavins; image-based color evaluation	[2]
Dark Tea	Aged-aroma purity and richness; taste fullness; polyphenols; caffeine; fermentation degree; post-fermentation age	Electronic nose/electronic tongue; near-infrared/Raman spectroscopy; colorimetric sensor arrays	Odor-pattern recognition; NIR + PLSR compound prediction; colorimetric array + CNN for fermentation evaluation	[38,62]
Matcha	Powder color and greenness; free amino acids; aroma quality	Hyperspectral imaging; nano-enabled colorimetric arrays; spectroscopy-based fusion sensing	HMI + PLSR for chlorophyll prediction; ZIF-8 colorimetric array + ANN for grade classification	[69]
Jasmine Tea	Aroma intensity and purity; appearance uniformity; flower-debris content	Electronic nose; GC–IMS fingerprinting; machine vision	E-nose + PLS-DA for grade discrimination; GC–IMS fingerprint analysis; image-based impurity detection	[2]

Abbreviations: PLS, partial least squares; Elman, Elman neural network; CNN, convolutional neural network; MOS, metal oxide semiconductor; PLS-DA, partial least squares-discriminant analysis; NIR, near-infrared; PLSR, partial least squares regression; HMI, hyperspectral microscope imaging; ZIF-8, zeolitic imidazolate framework-8; ANN, artificial neural network; GC–IMS, gas chromatography–ion mobility spectrometry.

Table 3. Representative multimodal and advanced sensing methods for non-destructive detection of pesticide residues in tea.

Method	Sensing Modality	Model Performance	References
Fluorescence hyperspectral imaging + CNN/RF	EEM fluorescence + 1D-CNN	Classification accuracy: 99.05% for identification of multiple pesticide residues	[28]
Handheld Raman spectroscopy + deep CNN	SERS + 1D-CNN	Multi-pesticide classification accuracy > 95%	[80,81]
NIR + SERS fusion	NIR reflectance + SERS	Pesticide quantification with fused PLSR model, R² ≈ 0.99	[82]
Machine vision + electronic nose (conceptual)	Visible imaging + MOS gas sensors	Joint screening of suspicious residue spots and odor anomalies	[81]

Abbreviations: EEM, excitation–emission matrix; 1D-CNN, one-dimensional convolutional neural network; CNN/RF, convolutional neural network with random forest; SERS, surface-enhanced Raman spectroscopy; NIR, near-infrared; PLSR, partial least squares regression; MOS, metal oxide semiconductor.
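
The fused-model entries in Table 3 (for example, NIR + SERS with a fused PLSR model) rely on combining sensor blocks before regression. A minimal sketch of one common low-level fusion step is given here, assuming that each block is autoscaled column-wise and then weighted by 1/sqrt(number of variables) so that a wide block does not dominate a narrow one. The function names `autoscale` and `fuse_low_level` and the numeric values are hypothetical illustrations, and this block-scaling choice is an assumption rather than the procedure of any cited study.

```python
from math import sqrt
from statistics import mean, pstdev


def autoscale(block):
    """Column-wise z-score of one sensor block (a list of sample rows)."""
    cols = list(zip(*block))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 to avoid division by zero.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in block]


def fuse_low_level(blocks):
    """Autoscale each block, weight it by 1/sqrt(n_variables), and concatenate per sample."""
    n_samples = len(blocks[0])
    if any(len(b) != n_samples for b in blocks):
        raise ValueError("all blocks must describe the same samples in the same order")
    fused = [[] for _ in range(n_samples)]
    for block in blocks:
        scaled = autoscale(block)
        weight = 1.0 / sqrt(len(block[0]))  # equalizes total variance across blocks
        for i, row in enumerate(scaled):
            fused[i].extend(v * weight for v in row)
    return fused


# Synthetic placeholder values (not measured data): 3 samples,
# 4 hypothetical NIR variables and 2 hypothetical SERS peak intensities.
nir = [[0.10, 0.20, 0.30, 0.40],
       [0.20, 0.10, 0.40, 0.30],
       [0.30, 0.30, 0.20, 0.10]]
sers = [[120.0, 80.0],
        [150.0, 60.0],
        [90.0, 100.0]]

fused = fuse_low_level([nir, sers])  # 3 rows, 6 fused features each
```

The fused matrix would then be passed to a regression model such as PLSR. To keep validation leakage-free, the scaling statistics should be computed on training samples only and reused unchanged for test samples, which this sketch does not do.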

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
