Article

Hyperspectral Imaging for the Dynamic Mapping of Total Phenolic and Flavonoid Contents in Microgreens

by Pawita Boonrat 1, Manish Patel 2, Panuwat Pengphorm 3,4, Preeyabhorn Detarun 5 and Chalongrat Daengngam 3,4,*

1 Sustainability Technology Research Unit, Faculty of Technology and Environment, Prince of Songkla University, Phuket 83120, Thailand
2 Sapling, 17 Dulka Road, London SW1V 1AF, UK
3 Division of Physical Science, Faculty of Science, Prince of Songkla University, Hat Yai, Songkhla 90110, Thailand
4 National Astronomical Research Institute of Thailand (Public Organization), Mae Rim, Chiang Mai 50180, Thailand
5 Center of Excellence in Functional Foods and Gastronomy, Faculty of Agro-Industry, Prince of Songkla University, Hat Yai, Songkhla 90110, Thailand
* Author to whom correspondence should be addressed.
AgriEngineering 2025, 7(4), 107; https://doi.org/10.3390/agriengineering7040107
Submission received: 24 February 2025 / Revised: 21 March 2025 / Accepted: 2 April 2025 / Published: 7 April 2025

Abstract

This study investigates the application of hyperspectral imaging (HSI) combined with machine learning (ML) models for the dynamic mapping of total phenolic content (TPC) and total flavonoid content (TFC) in sunflower microgreens. Spectral data were collected across different cultivation durations (Days 5, 6, and 7) to assess the secondary metabolite distribution in leaves and stems. Overall, the results indicate that TFC in leaves peaked on Day 5, followed by a decline on Days 6 and 7, while stems exhibited the opposite trend. However, TPC did not show a consistent pattern. Spectral reflectance analysis revealed higher near-infrared reflectance in leaves than in stems. The variation in trait and spectral data among the collected samples was sufficient to develop models predicting TPC and TFC. K-nearest neighbours provided the highest predictive accuracy for TPC ($R^2$ = 0.95 and RMSE = 1.6 mg GAE/100 g) and ridge regression performed best for TFC ($R^2$ = 0.97 and RMSE = 6.1 mg QE/100 g). Dimensionality reduction via principal component analysis (PCA) proved effective for TPC and TFC prediction, with PC1 alone achieving performance comparable to the full spectral dataset. This integrated HSI-ML approach offers a non-destructive, real-time method for monitoring bioactive compounds, supporting sustainable agricultural practices, optimising harvest timing, and enhancing crop management. The findings can be further developed for smart microgreen farming to enable real-time secondary metabolite quantification, with future research recommended to explore other microgreen varieties for broader applicability.

1. Introduction

Microgreens are edible plants (e.g., vegetables, herbs, and flowers) collected 7–14 days after germination, when the cotyledon leaves are formed but the true leaves have not emerged [1,2,3]. Compared to their mature counterparts, microgreens contain more antioxidants, vitamins, and minerals, making them a nutrient-rich diet supplement [3,4,5,6]. Microgreens can provide health benefits to an expanding and aging population due to their ease of cultivation, high nutritional value, and high resource use efficiency in controlled environment agriculture [7,8,9]. Despite their growing popularity, more research is needed to improve cultivation methods and knowledge regarding the bio-active chemical dynamics of microgreens [10].
Secondary metabolites are multifunctional substances in plants that are essential for defense mechanisms and environmental communication; they also influence the color, taste, and aroma of plants [11,12]. Polyphenols are plant secondary metabolites with hydroxyl groups [13,14]. Phenolics and flavonoids are subcategories of polyphenols. Secondary metabolites are significantly influenced by the phenological stage of the plant, which serves as a key factor affecting their concentration [15]. The biosynthesis and accumulation of secondary metabolites in plants are processes governed by factors such as genetic traits, growing period, and environmental conditions [11,16]. These metabolites are synthesized in all parts of the plant, and their concentrations vary throughout all stages of plant growth and development. This variation is mainly due to changes in the expression of genes that regulate enzyme activity involved in the biosynthesis of phenolic compounds, which can be affected by both genetic traits and environmental factors [17,18]. Therefore, understanding the growth periods during which this pathway is most active is important for optimizing functional metabolite production [19].
Recent studies have demonstrated that flavonoids and phenols play a crucial role in managing diabetes, hypertension, and many other diseases, as well as enhancing the body’s antioxidant capacity [20,21,22]. The accurate quantification of flavonoids and phenols can help improve crop health, food quality, and agricultural research. To quantify phenolic and flavonoid content, a variety of methods have been used, including chromatography [23], spectrophotometry [24], and mass spectrometry [25]. These approaches are destructive, laborious, time-consuming, and expensive [26]. Portable spectroscopy and computer vision are faster, simpler, and less invasive. By utilising modern ML techniques, portable spectrometers can accurately measure leaf reflectance and correlate it with biochemical parameters [27]. However, these devices are expensive and are confined to small-scale laboratory applications, making them unsuitable for large-scale commercial monitoring. Currently, there is no real-time method for accurately assessing the biochemical composition of microgreens. Creating affordable, field-ready technologies can help farmers improve agricultural quality and sustainability by giving them access to more granular data for more targeted decisions about fertilizer spraying, crop planning, and soil management [28].
Computer vision systems can classify food products, detect defects, and estimate quality attributes without causing damage [29]. In recent decades, imaging technologies have transformed fruit and vegetable quality assessment [30]. Hyperspectral imaging (HSI) is a leading non-destructive quality evaluation tool for agricultural products. HSI combines imaging and spectroscopy to provide spatial and spectral data for each pixel, allowing a detailed analysis of chemical composition and physical features [26]. HSI is ideal for real-time quality monitoring and in-line classification in the food sector due to its ability to detect huge and diverse samples quickly [31]. HSI operates over a wide spectral range, typically from 500 to 2500 nanometers, capturing subtle variations in reflectance and absorption [32]. By analyzing the unique spectral signatures of crops, HSI aids in the timely detection of diseases and deteriorations [33]. The ability of HSI to map the spatial distribution of plant components offers a powerful tool for the authentication and analysis of agricultural and food products [34]. HSI is a non-invasive and precise alternative to chemical assays for plant monitoring. The technology also enables the estimation of crop yield and nutrient content before harvest [35]. Without damaging the sample, HSI can measure critical nutrients [36] and metabolites [22].
Machine learning (ML), a type of artificial intelligence (AI) that enables computers to learn from available data, is used to interpret the enormous volume of data produced by an HSI system and to evaluate the characteristics of plants or soil. The use of spectral data in conjunction with ML techniques to predict polyphenol content in microgreens is novel: neither the efficient integration of these techniques nor the use of several ML models with varying input configurations has yet been studied. Selecting an ML model that is both accurate and efficient is crucial for handling high-dimensional hyperspectral data.
Regression in machine learning is a supervised learning technique used to investigate the relationship between independent features and a dependent outcome. Regression models are suitable when the outcome is a continuous value. This study tested linear, non-linear, and ensemble-based regression models, including Support Vector Regression (SVR), random forest, linear regression, ridge regression, lasso regression, K-nearest neighbors (KNN), elastic net, and gradient boosting. These models were chosen because they differ in complexity, interpretability, and predictive performance. Linear regression is a simple model that helps establish the baseline performance. Ridge regression constrains the regression coefficients by adding a regularization term to the cost function, which makes it suitable for handling multicollinearity [37]. Lasso, which stands for least absolute shrinkage and selection operator, combines regularization with feature selection [38]. Elastic net combines the regularization terms of ridge and lasso regressions. Random forest and gradient boosting capture non-linear relationships and can mitigate overfitting [39]. SVR is kernel-based and effective for high-dimensional data. KNN, a non-parametric model, learns directly from the data without assuming a predefined functional relationship [40].
Despite its extensive use in larger crops, HSI is underutilized in studying microgreens. Microgreens are a good test organism for determining the correlation between HSI data and plant biochemical composition, as they are quick, inexpensive, and easy to grow. The lack of research on the effective customization of HSI to monitor the dynamics of metabolites in microgreens is a critical gap. To fill this gap, this study demonstrates how HSI technology can detect and visualize Total Phenolic Content (TPC) and Total Flavonoid Content (TFC) in the cultivation of microgreens.
The main objective of this study is to develop and validate an HSI-based ML protocol for the quantification of TPC and TFC using samples of sunflower microgreens. Specifically, this study aims to (i) collect hyperspectral reflectance data from sunflower microgreens on different days of cultivation (Days 5–7); (ii) analyze spectral variations between plant parts (leaves versus stems); (iii) train and evaluate multiple ML models to predict TPC and TFC using hyperspectral data; (iv) apply principal component analysis (PCA) to compare the predictive performance of complete spectral data versus reduced features; and (v) demonstrate the model’s capability to map biochemical content for microgreens. By addressing these objectives, this study contributes to developing a scalable and automated secondary-metabolite assessment approach for microgreens. Using HSI and ML models can help retrieve secondary metabolite content quickly and accurately, helping to determine harvest times, improve crop management, and optimize health benefits.

2. Methodology

2.1. Sample Preparation

Commercial large striped sunflower seeds, purchased from Supertop Greenhouse (Thailand), were soaked in water for 12 h to stimulate germination, followed by thorough rinsing and draining. The seeds were then stored in a container covered with a damp cloth and kept away from direct sunlight for an additional 12 h to maintain moisture. The cultivation was soil-based, utilizing standard commercial soil (Din-na-mom; Thailand). Shallow trays, 60 cm (length) × 30 cm (width) × 3 cm (depth), with drainage holes were used to maximize the growing area.
On Day 1, six trays were filled to approximately three-quarters of their capacity with soil, creating a soil depth of roughly 2 cm. In each tray, 100 g of seeds were evenly distributed over the soil surface and firmly pressed into the soil using a soil tamper. The seeds were germinated by keeping the soil tampers on the trays for 48 h. On Day 3, the soil tampers were removed and the prepared trays were sprinkled with a thin layer of soil before being dampened using a mister to avoid disturbing the seeds. The trays were then covered with lids to maintain darkness for an additional 24 h. By Day 4, the seeds had germinated. The trays were subsequently placed on the shelves of a custom-made rack in a greenhouse under a natural light–dark cycle. The ambient conditions, including temperature, relative humidity (%RH), and CO₂ concentration, were maintained at 30 ± 2 °C, 75 ± 5%, and 400 ± 40 ppm, respectively. The seedlings were irrigated by misting every 12 h. Sampling of sunflower microgreens was conducted on Days 5, 6, and 7. The microgreens were cut approximately 1 cm above the soil. For each day, 42 microgreens were sampled for HSI, resulting in a total of 126 samples. For trait data analysis, 100 g of leaves and 100 g of stems were collected each day. The stems (hypocotyl) and leaves (cotyledons) were separated by cutting 5 mm below the leaf.
Figure 1 outlines the experiment where two sets of microgreen samples were prepared for TPC and TFC measurement and HSI to collect the trait and spectral data.
The prepared samples were then subjected to HSI, using a setup comprised of known light sources and a hyperspectral camera to capture spectral data. Figure 2a displays the custom-built HSI system, which comprises a hyperspectral camera, controlled illumination, and a motorized stage. Figure 2b is an RGB image of a sunflower microgreen. Figure 2c illustrates a hyperspectral data cube that captures spatial and spectral reflectance information. Figure 2d shows examples of hyperspectral images at different wavelengths.
At the same time, trait data collection was performed to obtain the relevant physical and biochemical attributes of the microgreens. Following imaging, HSI data extraction was conducted to process and analyze the spectral information from the captured images. Subsequent model training was carried out to develop predictive models relating the hyperspectral data to the collected trait data. Finally, the trained model was utilized to establish correlations between hyperspectral characteristics and measured traits, allowing a precise analysis of the characteristics of sunflower microgreens.

2.2. Trait Data Acquisition

Fresh samples were weighed, finely ground, and subjected to extraction using 50% ethanol in a 1:1 ratio. The mixture was shaken at 120 rpm for 24 h and subsequently centrifuged at 6000 rpm at 25 °C for 15 min. The supernatant was then filtered under vacuum using Whatman No. 1 filter paper and diluted with 50% ethanol for further analysis. TPC was measured using the Folin–Ciocalteu method [41]. A 20 µL aliquot of the extract was mixed with 100 µL of 10% Folin–Ciocalteu reagent and 80 µL of 7.5% Na₂CO₃ in a microplate. The mixture was incubated in the dark for 30 min at room temperature, and absorbance was measured at 765 nm. Gallic acid was used as the standard, and the results were expressed in mg gallic acid equivalent per 100 g of sample (mg GAE/100 g sample).
The measurement of TFC followed a modified protocol [42]. A 500 µL sample extract was mixed with 100 µL of 10% aluminum chloride and incubated for 5 min. Next, 100 µL of 5% sodium nitrite was added, followed by a 6 min incubation. A final addition of 300 µL of 1 M sodium hydroxide was made, and the mixture was incubated in the dark for 30 min at room temperature. Absorbance was recorded at 430 nm using quercetin as the standard, and the results were expressed in mg quercetin equivalent per 100 g of sample (mg QE/100 g sample).
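Both assays convert an absorbance reading into an equivalent concentration through a standard curve. The following is a minimal sketch of that arithmetic, assuming a linear curve; the standard concentrations, dilution factor, extract volume, and sample mass below are hypothetical placeholders, since the source does not report them.

```python
import numpy as np

def fit_standard_curve(conc_mg_per_ml, absorbance):
    """Least-squares line through the standards: absorbance = slope * conc + intercept."""
    slope, intercept = np.polyfit(conc_mg_per_ml, absorbance, deg=1)
    return slope, intercept

def to_equivalents_per_100g(absorbance, slope, intercept,
                            extract_volume_ml, sample_mass_g, dilution=1.0):
    """Convert one absorbance reading to mg standard equivalent per 100 g sample."""
    # Concentration of the measured (diluted) extract, read off the curve.
    conc_mg_per_ml = (absorbance - intercept) / slope
    # Undo the dilution and scale by the total extract volume.
    mg_total = conc_mg_per_ml * dilution * extract_volume_ml
    return mg_total / sample_mass_g * 100.0

# Hypothetical gallic acid standards (mg/mL) with an idealized linear response.
std_conc = np.array([0.00, 0.02, 0.04, 0.06, 0.08])
std_abs = 5.0 * std_conc + 0.05
slope, intercept = fit_standard_curve(std_conc, std_abs)

# Hypothetical reading: absorbance 0.30, 4x dilution, 100 mL extract, 100 g sample.
tpc = to_equivalents_per_100g(0.30, slope, intercept,
                              extract_volume_ml=100.0, sample_mass_g=100.0,
                              dilution=4.0)
```

The same two functions would apply to TFC with quercetin standards and the 430 nm readings.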

2.3. Spectral Data Acquisition

This study used a custom-built HSI system, which is compact and re-configurable and is composed of commonly accessible optical components for cost-effectiveness. The system was operated in reflectance mode via pushbroom scanning at a speed of 0.57 mm/s. The system collects the spectral bands ranging from 450 to 854 nm, with a spectral resolution of around 1.6 nm. The measured in-track and cross-track resolutions were approximately 1.24 lp/mm and 2.05 lp/mm, respectively. These results correspond to spatial resolution values of approximately 0.81 mm and 0.49 mm for in-track and cross-track directions, respectively. Before scanning the samples, the HSI apparatus was calibrated. The sample’s calibrated reflectance, $R_{\mathrm{sample}}$, was obtained via the following:
$$R_{\mathrm{sample}} = \frac{I_{\mathrm{sample}} - I_{\mathrm{dark}}}{I_{\mathrm{ref}} - I_{\mathrm{dark}}} \times R_{\mathrm{ref}}$$
where $I_{\mathrm{sample}}$ is the sample’s intensity measured by the HSI system. $R_{\mathrm{ref}}$ is a known reflectance of the standard reference, a Spectralon reflectance standard (SRS-40-010; Labsphere, Inc., North Sutton, NH, USA). $I_{\mathrm{ref}}$ is the reference’s intensity measured by the HSI system. $I_{\mathrm{dark}}$ is the reflection intensity measured by the HSI system under dark conditions. PCA and K-means clustering were applied to the obtained spectra; the objective was to facilitate the interpretation of spectral characteristics and improve the selection of characteristics for subsequent analysis.
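The calibration above is a per-band operation and can be sketched in a few lines of NumPy. The function name, the raw counts, and the reference reflectance value of 0.40 are assumptions made for illustration; they are not taken from the authors' software.

```python
import numpy as np

def calibrate_reflectance(i_sample, i_ref, i_dark, r_ref=1.0, eps=1e-12):
    """Reflectance = (I_sample - I_dark) / (I_ref - I_dark) * R_ref, band by band."""
    i_sample = np.asarray(i_sample, dtype=float)
    i_ref = np.asarray(i_ref, dtype=float)
    i_dark = np.asarray(i_dark, dtype=float)
    denom = i_ref - i_dark
    # Mark bands where the reference equals the dark signal as undefined (NaN)
    # instead of dividing by zero.
    denom = np.where(np.abs(denom) < eps, np.nan, denom)
    return (i_sample - i_dark) / denom * r_ref

# Hypothetical raw counts for three spectral bands.
raw = np.array([1200.0, 2100.0, 3100.0])
white = np.array([4100.0, 4100.0, 4100.0])
dark = np.array([100.0, 100.0, 100.0])

# r_ref = 0.40 is an assumed value for the reference panel.
reflectance = calibrate_reflectance(raw, white, dark, r_ref=0.40)
```

Because the arrays broadcast, the same call works on a full data cube as long as the reference and dark frames share its band axis.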

2.4. Machine Learning Models

To predict the TPC and TFC of sunflower microgreens from HSI data, eight regression models were trained: linear regression, ridge regression, lasso regression, elastic net, random forest, gradient boosting, SVR, and KNN. The dataset was pre-processed by standardizing the spectral features. The standardized dataset was then split into training (80%) and testing (20%) sets, with a fixed random seed to maintain reproducibility. Linear regression served as a baseline model. Ridge regression (α = 1.0) and lasso regression (α = 0.01) introduced L2 and L1 regularization, respectively, to manage multicollinearity and improve feature selection. Elastic net (α = 0.01, L1 ratio = 0.5) combined the advantages of lasso and ridge regressions. The ensemble-based models were random forest, set to 100 estimators, and gradient boosting, configured with 100 estimators and a learning rate of 0.1; both use decision trees to capture non-linear relationships in spectral data. SVR, with an RBF kernel, used a regularization parameter C = 10 and ε = 0.1 to balance margin tolerance and accuracy. The KNN model was set to k = 5, making predictions based on proximity in the feature space. Hyperparameters were fixed at the pre-defined settings listed above. Each model was evaluated on the basis of its ability to predict TPC and TFC from the spectral features. Model performance was assessed using the coefficient of determination ($R^2$) and the root mean square error (RMSE). The $R^2$ values indicated the goodness of fit of the model, while the RMSE quantified prediction errors.
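The configuration described above can be sketched with scikit-learn. The source does not name the library used, so scikit-learn, the seed value, the `max_iter` setting, and the synthetic data are all assumptions of this sketch; only the listed hyperparameters and the 80/20 split come from the text.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

SEED = 42  # assumed; the text only states that a fixed seed was used

def build_models(seed=SEED):
    """The eight regressors with the hyperparameters stated in the text."""
    return {
        "linear": LinearRegression(),
        "ridge": Ridge(alpha=1.0),
        # max_iter is raised only to help convergence; it is not from the source.
        "lasso": Lasso(alpha=0.01, max_iter=10000),
        "elastic_net": ElasticNet(alpha=0.01, l1_ratio=0.5, max_iter=10000),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(
            n_estimators=100, learning_rate=0.1, random_state=seed),
        "svr": SVR(kernel="rbf", C=10, epsilon=0.1),
        "knn": KNeighborsRegressor(n_neighbors=5),
    }

def evaluate(X, y, seed=SEED):
    """Standardize, split 80/20, fit every model, return {name: (R2, RMSE)}."""
    # Standardize before splitting, mirroring the order described in the text.
    X_std = StandardScaler().fit_transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_std, y, test_size=0.2, random_state=seed)
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        rmse = float(np.sqrt(mean_squared_error(y_te, pred)))
        scores[name] = (float(r2_score(y_te, pred)), rmse)
    return scores

# Synthetic stand-in for 126 spectra with 20 features (not the study's data).
rng = np.random.default_rng(SEED)
X_demo = rng.normal(size=(126, 20))
y_demo = X_demo[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=126)
results = evaluate(X_demo, y_demo)
```

Fitting the scaler on the training split only would be a more conservative variant, since it keeps test-set statistics out of the pre-processing step.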

3. Results

3.1. Trait Data

Figure 3 shows the TPC and TFC in sunflower microgreens throughout the duration of cultivation (Days 5–7) in leaves and stems. In leaves, the highest TPC and TFC were observed on Day 5, whereas in stems both were highest on Day 7. A statistically significant difference (p < 0.05) was observed between leaves and stems for both TPC and TFC, with leaves containing significantly higher amounts of both compounds than stems. The TPC of the leaf samples was highest on Day 5 at 20.23 ± 0.15 mg GAE/100 g, decreasing to 16.08 ± 0.09 mg GAE/100 g on Day 6 and recovering to 19.81 ± 0.13 mg GAE/100 g on Day 7. The TPC in stem samples was significantly lower, with values of 4.99 ± 0.06, 4.80 ± 0.02, and 7.46 ± 0.11 mg GAE/100 g on Days 5, 6, and 7, respectively. The TFC of leaf samples was highest on Day 5 at 91.08 ± 0.05 mg QE/100 g, decreasing to 74.04 ± 1.09 mg QE/100 g on Day 6 and further to 70.96 ± 1.11 mg QE/100 g on Day 7. The TFC in the stems was significantly lower than in the leaves, with values of 8.14 ± 0.06, 10.02 ± 0.22, and 15.45 ± 0.12 mg QE/100 g on Days 5, 6, and 7, respectively.

3.2. Spectral Data

Figure 4 illustrates the hyperspectral reflectance analysis and the results of PCA and K-means clustering applied to sunflower microgreens. Figure 4a presents the mean reflectance spectra of leaves and stems. In the green region (550 nm), leaves exhibit a higher reflectance compared to stems due to stronger chlorophyll absorption in stems. In the red region (680–700 nm), both leaves and stems show significant absorption, with a sharper decline in leaf reflectance attributed to the red edge of the vegetation. The most pronounced difference is observed in the NIR region (700–850 nm), where leaves show substantially higher reflectance than stems. This is due to multiple scattering within the internal leaf structure, which enhances NIR reflectance, whereas stems, with their denser cellular composition, exhibit lower internal scattering. Figure 4b illustrates the variance explained by the first five principal components (PCs). PC1 captures 90% of the total variance, while PC2 accounts for 9% and PC3 contributes less than 1% (see Figure 4c). PC1 exhibits relatively uniform loadings across the observed wavelength range, with positive loadings before 720 nm and negative loadings beyond 720 nm. This suggests that spectral features across the entire range contribute to the primary variance, with a notable shift occurring around the red-edge region. PC2 loadings reveal a prominent feature near 550 nm, indicating that variations in reflectance within the green region play a key role in spectral differentiation. Additionally, PC2 exhibits a sharp positive peak around 720 nm, corresponding to the red-edge transition, a spectral region strongly linked to chlorophyll absorption and plant health indicators. Figure 4d further validates these observations through K-means clustering applied to the PCA results, demonstrating a clear separation between the clusters of leaves and stems along PC1. The days of cultivation do not form distinct clusters.
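The PCA and K-means steps can be illustrated on synthetic spectra. The sketch below uses a NumPy-only PCA (via singular value decomposition) and a basic Lloyd-style K-means; the sigmoid-shaped "leaf" and "stem" spectra, the noise level, and all function names are hypothetical stand-ins, not the measured data or the authors' code.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PC scores, loadings, and explained-variance ratios via SVD."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Xc @ Vt[:n_components].T, Vt[:n_components], explained[:n_components]

def kmeans(points, k=2, n_iter=50, seed=0):
    """Plain Lloyd iterations: assign to nearest center, then recompute centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic spectra: a red-edge step near 720 nm with a higher NIR plateau for leaves.
wl = np.linspace(450, 854, 100)
rng = np.random.default_rng(1)

def synth(nir_plateau, n):
    base = 0.05 + nir_plateau / (1 + np.exp(-(wl - 720) / 15))
    return base + rng.normal(scale=0.01, size=(n, wl.size))

X = np.vstack([synth(0.6, 30), synth(0.3, 30)])  # 30 "leaf" rows, then 30 "stem" rows
scores, loadings, explained = pca_scores(X)
labels, _ = kmeans(scores, k=2)
```

In this toy setting the two groups differ mainly in NIR amplitude, so PC1 carries most of the variance and the clusters split along it, which is the qualitative behaviour reported for Figure 4d.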

3.3. Performance Metrics

Figure 5 presents the performance evaluation of different ML models to predict TPC and TFC based on hyperspectral data. Regarding TPC, KNN achieved the highest $R^2$ value (0.95) across all spectral configurations. Ridge regression and lasso regression also performed well with $R^2$ values above 0.93. For the prediction of TFC, ridge regression exhibited the highest $R^2$ (0.97) when using the full spectral data set, closely followed by lasso regression (0.96) and elastic net (0.96). Although PC1 alone captures most of the spectral variance, using only PC1 or PC1 combined with PC2 results in slightly lower model accuracy compared to using the full hyperspectral dataset. This effect is more noticeable for the TFC predictions, where RMSE values were higher when reducing the data dimensions.
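The comparison between full spectra and PC1 alone can be sketched as follows, using a closed-form ridge solution and explicit $R^2$ and RMSE definitions. The synthetic spectra, the target, and the helper names are assumptions for illustration; unlike the procedure described in the text, the scaler and PCA here are fitted on the training split only.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

# Synthetic data: the target scales with the NIR plateau of each spectrum.
rng = np.random.default_rng(0)
wl = np.linspace(450, 854, 40)
amp = rng.uniform(0.2, 0.7, size=126)
X = 0.05 + amp[:, None] / (1 + np.exp(-(wl - 720) / 15))
X = X + rng.normal(scale=0.01, size=X.shape)
y = 100 * amp + rng.normal(scale=1.0, size=126)

# Roughly 80/20 split; standardization statistics come from the training rows.
idx = rng.permutation(126)
tr, te = idx[:100], idx[100:]
mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0)
Xtr, Xte = (X[tr] - mu) / sd, (X[te] - mu) / sd

# First principal axis of the (already centered) training matrix.
_, _, Vt = np.linalg.svd(Xtr, full_matrices=False)
pc1 = Vt[:1].T

results = {}
for name, (A_tr, A_te) in {"full": (Xtr, Xte), "pc1": (Xtr @ pc1, Xte @ pc1)}.items():
    w, b = ridge_fit(A_tr, y[tr], alpha=1.0)
    pred = A_te @ w + b
    results[name] = (r2_score(y[te], pred), rmse(y[te], pred))
```

Comparing `results["full"]` with `results["pc1"]` gives the same kind of side-by-side $R^2$ and RMSE reading as Figure 5, here on toy data only.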

3.4. Secondary Metabolite Mapping

Figure 6 illustrates the application of HSI for the dynamic mapping of secondary metabolites in sunflower microgreens. The image is an RGB representation of a sunflower microgreen sample collected on Day 7, scanned under the HSI system. Seven pixels were strategically selected on the microgreen, corresponding to the labeled Points (1–7), to obtain seven spectra. Each spectrum was then input into the pre-trained models as follows: ridge regression for TFC and KNN for TPC. The resulting TFC and TPC values for each point are presented in Table 1.
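Extending the point-wise predictions to a full map amounts to applying a trained model to every pixel spectrum in the data cube. A minimal sketch follows, assuming a cube laid out as (rows, columns, bands) and any predictor that maps an array of spectra to one value per spectrum; the toy linear predictor, the mask threshold, and the pixel coordinates are hypothetical.

```python
import numpy as np

def map_trait(cube, predict, mask=None):
    """Apply `predict` to each masked pixel spectrum; unmasked pixels become NaN."""
    rows, cols, bands = cube.shape
    flat = cube.reshape(-1, bands)
    if mask is None:
        mask = np.ones((rows, cols), dtype=bool)
    selected = mask.ravel()
    out = np.full(rows * cols, np.nan)
    if selected.any():
        out[selected] = predict(flat[selected])
    return out.reshape(rows, cols)

# Toy cube and toy linear predictor standing in for a trained model.
rng = np.random.default_rng(0)
cube = rng.uniform(0.0, 1.0, size=(4, 5, 8))
weights = np.linspace(0.0, 1.0, 8)

def toy_predict(spectra):
    return spectra @ weights

# Crude plant mask: keep pixels whose last (longest-wavelength) band exceeds 0.3.
plant_mask = cube[:, :, -1] > 0.3
trait_map = map_trait(cube, toy_predict, plant_mask)

# Values at selected pixels, analogous to reading off labeled points.
points = [(0, 0), (2, 3)]
point_values = [trait_map[r, c] for r, c in points]
```

With a fitted scikit-learn style model, `predict` could simply be `model.predict`, provided the pixel spectra receive the same pre-processing as the training spectra.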

4. Discussion

4.1. Data Analysis

From the results in Figure 3, TFC in leaves was highest on Day 5 before declining on Days 6 and 7, while stems followed the opposite trend. This pattern suggests that secondary metabolite production is more prominent in the early growth stage, with greater accumulation in leaves than in stems. Secondary metabolites in microgreens typically peak early in development [43], particularly after germination, when plants activate defense mechanisms against environmental stressors [44]. However, more samples are needed to confirm the exact day of peak metabolite levels. In contrast, TPC did not follow a clear trend, highlighting the need for additional data. Future research should consider a larger dataset to explore this further. Spectral reflectance analysis showed that leaves exhibited higher NIR reflectance than stems, probably due to greater light scattering in the leaf mesophyll [45]. This distinction was also reflected in K-means clustering, where PC1 effectively separated leaf and stem spectra.

4.2. Model Performance

Among the ML models tested, KNN was the most accurate for TPC and ridge regression for TFC. For TFC, using the full spectra performs slightly better than PC1 alone, with around 2% higher $R^2$ and 10% lower RMSE. For TPC, PC1 performance is comparable to that of the full dataset. Therefore, PC1 can be used to reduce computational complexity with little loss of accuracy. Combining PC1 and PC2 did not provide significant advantages over PC1 alone; thus, higher-order PCs offer no significant improvement.
KNN, a non-parametric method, is based on instance-based learning [46] and requires the entire data set to be stored, leading to increased processing demands as the data set grows. Ridge regression, a parametric model, is more efficient and scalable. For the prediction of TFC, the use of full spectral data provided only a minor improvement over PC1 alone. For TPC, PC1 alone performed nearly as well as the full dataset, making it a suitable option to reduce computational complexity while maintaining accuracy. However, spectral noise, cultivation variability, and batch differences can still affect performance. Pre-processing techniques such as smoothing filters or adaptive feature selection were not applied here; they could improve the robustness of the model and are recommended for future work [47,48].

4.3. Challenges and Limitations

This study has demonstrated that the integration of HSI with ML can predict secondary metabolite concentrations in microgreens. Further validation of this technique requires expanding the scope beyond sunflower microgreens to include other microgreen species and cultivation conditions.
Computational efficiency is also a concern. Image processing for spectral data extraction took approximately 8 min on a MacBook Air (M2, 2022) with 16 GB RAM. However, once trained, the ML models performed predictions rapidly. The runtime is 0.45 ± 0.21 ms for TPC predictions via KNN and 0.03 ± 0.01 ms for TFC predictions via ridge regression. The observed runtime difference is expected. KNN involves distance calculations and neighbor selection, making it slower than ridge regression, which involves a linear transformation of input features. Enlarging the training dataset increases this cost, so reducing the dimensionality is crucial to computational efficiency.
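The runtime contrast can be reproduced in spirit with a small timing sketch. The NumPy implementations, the data sizes (100 training spectra, 250 bands), and the repeat count are assumptions; absolute timings depend on hardware and will not match the figures reported above.

```python
import time
import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    """Average the targets of the k nearest training spectra (Euclidean distance)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argpartition(d, k)[:k]
    return float(y_train[nearest].mean())

def linear_predict(w, b, x):
    """A fitted ridge model predicts with a single dot product plus intercept."""
    return float(x @ w + b)

def time_call(fn, repeats=200):
    """Mean and standard deviation of the call time in milliseconds."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    return float(np.mean(samples)), float(np.std(samples))

# Hypothetical sizes and random stand-in data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 250))
y_train = rng.normal(size=100)
w, b = rng.normal(size=250), 0.5
x_new = rng.normal(size=250)

knn_ms = time_call(lambda: knn_predict(X_train, y_train, x_new))
ridge_ms = time_call(lambda: linear_predict(w, b, x_new))
```

The KNN call touches every stored training spectrum, so its cost grows with both the number of samples and the number of bands, whereas the linear prediction depends on the number of bands only.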
This study focused primarily on the content of phenolic and flavonoid compounds due to their known health benefits. Although these compounds play a role in antioxidant activity and disease prevention, they do not fully reflect the overall nutritional value. An alternative dimensionality reduction technique is to incorporate predefined vegetation indices (VIs) with ML, as demonstrated by Pane et al. [49]. Although VIs offer a practical approach to disease detection, they may not capture all spectral variations. Our study addresses this by analyzing a continuous hyperspectral spectrum. However, in resource-limited settings, models that rely on predefined VIs could still be viable if their reliability is thoroughly validated.

4.4. Future Directions

Future studies should include additional bioactive compounds, such as vitamins and minerals, to provide a more comprehensive assessment of microgreen quality. Lighting fluctuations and sensor variations may introduce inconsistencies and spectral noise. Such challenges can be tackled in the data pre-processing step by denoising algorithms such as wavelet transformation, adaptive median filter, and modified decision-based median filter [50,51]. Future work could explore deep learning approaches such as the Capsule Attention Network (CAN) for hyperspectral image classification. Wang et al. [52] introduced CAN as a deep learning model that integrates Capsule Networks with an Attention Mechanism, enhancing spectral-spatial feature extraction. In their approach, PCA was applied as a pre-processing step to reduce the spectral bands to 20% before feeding the data into CAN, improving efficiency. However, while CAN optimizes feature representation, deep learning models generally require large labeled datasets and high computational power during training, limiting their real-time applications. Graph-based clustering methods also offer potential improvements in hyperspectral image analysis. Wang et al. [52] proposed a structured doubly stochastic graph-based clustering approach that enhances clustering accuracy by reducing noise and improving intra-cluster connectivity. Although this method improves robustness, it involves solving an optimization problem, which can increase computational costs.
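As a simple illustration of spectral denoising, the sketch below applies a plain running median along the wavelength axis. This is the basic median filter, not the adaptive or modified decision-based variants cited above; the window size and the toy spectrum are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_spectrum(spectrum, window=5):
    """Running median along a 1-D spectrum, with edge padding to keep its length."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    padded = np.pad(np.asarray(spectrum, dtype=float), pad, mode="edge")
    return np.median(sliding_window_view(padded, window), axis=-1)

# Toy spectrum: a smooth ramp with one spike standing in for a noisy band.
clean = np.linspace(0.1, 0.5, 11)
noisy = clean.copy()
noisy[5] = 5.0
filtered = median_filter_spectrum(noisy, window=5)
```

A median filter removes isolated spikes while leaving monotonic trends largely intact, which is why it is often tried before smoother alternatives on reflectance spectra.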

5. Conclusions

The integration of HSI with ML models offers a robust approach to monitoring bioactive compounds in microgreens, enabling the precise, real-time assessment of secondary metabolites without the need for destructive sampling. PCA can be used to reduce dimensionality and increase computational efficiency with only a small loss of accuracy in predicting TPC and TFC. The performance of PC1 in all models is only slightly inferior to the full spectral dataset (450–850 nm), with up to 2% lower $R^2$ and 10% higher RMSE. The ability to map metabolites in different parts of microgreens at different stages of cultivation is valuable for optimizing harvest timing to maximize health benefits. This also contributes to better crop management and improved consumer health. Furthermore, the non-invasive nature of HSI supports sustainable agricultural practices by minimizing resource use and reducing waste. The findings from this study can be further developed for smart microgreen farming, enabling the real-time quantification of TPC and TFC to support precise dietary recommendations. Future studies should explore other types of microgreens to expand the applicability of this approach.

Author Contributions

Conceptualization, P.B. and M.P.; methodology, P.B. and M.P.; software, P.P. and P.B.; validation, P.B. and C.D.; formal analysis, P.P., P.B. and C.D.; investigation, P.B. and M.P.; resources, P.B. and P.D.; data curation, P.B., M.P. and P.D.; writing—original draft preparation, P.B.; writing—review and editing, M.P. and C.D.; visualization, P.B.; supervision, C.D.; project administration, C.D.; funding acquisition, P.B. and C.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Prince of Songkla University [Grant No. TAE6702014S] and the National Research Council of Thailand [Grant No. SCI620026S].

Data Availability Statement

The data presented in this study are available on request from the author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mir, S.A.; Shah, M.A.; Mir, M.M. Microgreens: Production, shelf life, and bioactive components. Crit. Rev. Food Sci. Nutr. 2017, 57, 2730–2736. [Google Scholar] [CrossRef] [PubMed]
  2. Weber, C.F. Broccoli Microgreens: A Mineral-Rich Crop That Can Diversify Food Systems. Front. Nutr. 2017, 4, 7. [Google Scholar] [CrossRef]
  3. Xiao, Z.; Lester, G.E.; Luo, Y.; Wang, Q. Assessment of vitamin and carotenoid concentrations of emerging food products: Edible microgreens. J. Agric. Food Chem. 2012, 60, 7644–7651. [Google Scholar]
  4. Guerreiro, S.L.M.; Cabral Júnior, J.F.G.; Eiras, B.J.C.F.; Miranda, B.d.S.; Lopes, P.C.A.; Melo, N.F.A.C.d.; Luz, R.K.; Sterzelecki, F.C.; Palheta, G.D.A. Integrating Aquaponics with Macrobrachium amazonicum (Palaemonidae: Decapoda) Cultivation for the Production of Microgreens: A Sustainable Approach. AgriEngineering 2024, 6, 2718–2731. [Google Scholar] [CrossRef]
  5. Kyriacou, M.C.; De Pascale, S.; Kyratzis, A.; Rouphael, Y. Microgreens as a component of space life support systems: A cornucopia of functional food. Front. Plant Sci. 2017, 8, 294717. [Google Scholar]
  6. Maluin, F.N.; Hussein, M.Z.; Nik Ibrahim, N.N.L.; Wayayok, A.; Hashim, N. Some Emerging Opportunities of Nanotechnology Development for Soilless and Microgreen Farming. Agronomy 2021, 11, 1213. [Google Scholar] [CrossRef]
  7. Birch, E.L. Food Security. J. Am. Plan. Assoc. 2015, 81, 241–242. [Google Scholar] [CrossRef]
  8. Ghoora, M.D.; Haldipur, A.C.; Srividya, N. Comparative evaluation of phytochemical content, antioxidant capacities and overall antioxidant potential of select culinary microgreens. J. Agric. Food Res. 2020, 2, 100046. [Google Scholar] [CrossRef]
  9. Huang, J.J.; Tan, C.X.; Zhou, W. Universal modeling for optimizing leafy vegetable production in an environment-controlled vertical farm. Comput. Electron. Agric. 2024, 219, 108715. [Google Scholar] [CrossRef]
  10. Huang, M.; Xu, H.; Zhou, Q.; Xiao, J.; Su, Y.; Wang, M. The nutritional profile of chia seeds and sprouts: Tailoring germination practices for enhancing health benefits–a comprehensive review. Crit. Rev. Food Sci. Nutr. 2024, 1–23. [Google Scholar] [CrossRef]
  11. Verma, N.; Shukla, S. Impact of various factors responsible for fluctuation in plant secondary metabolites. J. Appl. Res. Med. Aromat. Plants 2015, 2, 105–113. [Google Scholar] [CrossRef]
  12. Reshi, Z.A.; Ahmad, W.; Lukatkin, A.S.; Javed, S.B. From Nature to Lab: A Review of Secondary Metabolite Biosynthetic Pathways, Environmental Influences, and In Vitro Approaches. Metabolites 2023, 13, 895. [Google Scholar] [CrossRef]
  13. Harborne, J.B.; Williams, C.A. Advances in flavonoid research since 1992. Phytochemistry 2000, 55, 481–504. [Google Scholar]
  14. Ghani, U. Chapter three—Polyphenols. In Alpha-Glucosidase Inhibitors; Elsevier: Amsterdam, The Netherlands, 2020; pp. 61–100. [Google Scholar] [CrossRef]
  15. Savickiene, N.; Raudone, L. Trends in Plants Phytochemistry and Bioactivity Analysis. Plants 2024, 13, 3173. [Google Scholar] [CrossRef]
  16. Jan, R.; Asaf, S.; Numan, M.; Lubna; Kim, K.M. Plant Secondary Metabolite Biosynthesis and Transcriptional Regulation in Response to Biotic and Abiotic Stress Conditions. Agronomy 2021, 11, 968. [Google Scholar] [CrossRef]
  17. Xie, P.J.; Huang, L.X.; Zhang, C.H.; Zhang, Y.L. Phenolic compositions, and antioxidant performance of olive leaf and fruit (Olea europaea L.) extracts and their structure–activity relationships. J. Funct. Foods 2015, 16, 460–471. [Google Scholar]
  18. Zhan, X.; Chen, Z.; Chen, R.; Shen, C. Environmental and Genetic Factors Involved in Plant Protection-Associated Secondary Metabolite Biosynthesis Pathways. Front. Plant Sci. 2022, 13, 877304. [Google Scholar] [CrossRef]
  19. Kim, M.Y.; Kim, J.I.; Kim, S.W.; Kim, S.; Oh, E.; Lee, J.; Lee, E.; An, Y.J.; Han, C.Y.; Lee, H.; et al. Influence of Secondary Metabolites According to Maturation of Perilla (Perilla frutescens) on Respiratory Protective Effect in Fine Particulate Matter (PM2.5)-Induced Human Nasal Cell. Int. J. Mol. Sci. 2024, 25, 12119. [Google Scholar] [CrossRef]
  20. Manach, C.; Scalbert, A.; Morand, C.; Rémésy, C.; Jiménez, L. Polyphenols: Food sources and bioavailability. Am. J. Clin. Nutr. 2004, 79, 727–747. [Google Scholar] [CrossRef]
  21. Scalbert, A.; Manach, C.; Morand, C.; Rémésy, C.; Jiménez, L. Dietary polyphenols and the prevention of diseases. Crit. Rev. Food Sci. Nutr. 2005, 45, 287–306. [Google Scholar] [CrossRef]
  22. Yang, C.; Song, L.; Wei, K.; Gao, C.; Wang, D.; Feng, M.; Zhang, M.; Wang, C.; Xiao, L.; Yang, W.; et al. Study on Hyperspectral Monitoring Model of Total Flavonoids and Total Phenols in Tartary Buckwheat Grains. Foods 2023, 12, 1354. [Google Scholar] [CrossRef]
  23. Ahmad, A.; Husain, A.; Mujeeb, M.; Khan, S.A.; Alhadrami, H.A.A.; Bhandari, A. Quantification of total phenol, flavonoid content and pharmacognostical evaluation including HPTLC fingerprinting for the standardization of Piper nigrum Linn fruits. Asian Pac. J. Trop. Biomed. 2015, 5, 101–107. [Google Scholar] [CrossRef]
  24. Sankhalkar, S.; Vernekar, V. Quantitative and Qualitative Analysis of Phenolic and Flavonoid Content in Moringa oleifera Lam and Ocimum tenuiflorum L. Pharmacogn. Res. 2016, 8, 16–21. [Google Scholar] [CrossRef]
  25. Proestos, C.; Boziaris, I.; Nychas, G.J.; Komaitis, M. Analysis of flavonoids and phenolic acids in Greek aromatic plants: Investigation of their antioxidant capacity and antimicrobial activity. Food Chem. 2006, 95, 664–671. [Google Scholar] [CrossRef]
  26. Liu, D.; Zeng, X.-A.; Sun, D.-W. Recent Developments and Applications of Hyperspectral Imaging for Quality Evaluation of Agricultural Products: A Review. Crit. Rev. Food Sci. Nutr. 2015, 55, 1744–1757. [Google Scholar] [CrossRef] [PubMed]
  27. Yan, X.; Shi, W.; Zhao, W.; Luo, N. Estimation of Protein Content in Plant Leaves using Spectral Reflectance: A Case Study in Euonymus japonica. Anal. Lett. 2014, 47, 517–530. [Google Scholar] [CrossRef]
  28. Santos, Y.J.S.; Malegori, C.; Colnago, L.A.; Vanin, F.M. Application on infrared spectroscopy for the analysis of total phenolic compounds in fruits. Crit. Rev. Food Sci. Nutr. 2024, 64, 2906–2916. [Google Scholar] [CrossRef]
  29. Ma, J.; Sun, D.W.; Qu, J.H.; Liu, D.; Pu, H.; Gao, W.H.; Zeng, X.A. Applications of computer vision for assessing quality of agri-food products: A review of recent research advances. Crit. Rev. Food Sci. Nutr. 2016, 56, 113–127. [Google Scholar] [CrossRef]
  30. Li, L.; Jia, X.; Fan, K. Recent advance in nondestructive imaging technology for detecting quality of fruits and vegetables: A review. Crit. Rev. Food Sci. Nutr. 2024, 1–19. [Google Scholar] [CrossRef]
  31. Elmasry, G.; Kamruzzaman, M.; Sun, D.W.; Allen, P. Principles and Applications of Hyperspectral Imaging in Quality Evaluation of Agro-Food Products: A Review. Crit. Rev. Food Sci. Nutr. 2012, 52, 999–1023. [Google Scholar] [CrossRef]
  32. Garg, P.K. 10—Effect of contamination and adjacency factors on snow using spectroradiometer and hyperspectral images. In Hyperspectral Remote Sensing; Pandey, P.C., Srivastava, P.K., Balzter, H., Bhattacharya, B., Petropoulos, G.P., Eds.; Earth Observation; Elsevier: Amsterdam, The Netherlands, 2020; pp. 167–196. [Google Scholar]
  33. Cruz-Carrasco, C.; Díaz-Álvarez, J.; Chávez de la O, F.; Sánchez-Venegas, A.; Villegas Cortez, J. Detection of Aspergillus flavus in Figs by Means of Hyperspectral Images and Deep Learning Algorithms. AgriEngineering 2024, 6, 3969–3988. [Google Scholar] [CrossRef]
  34. Gowen, A.; Odonnell, C.; Cullen, P.; Downey, G.; Frias, J. Hyperspectral imaging—An emerging process analytical tool for food quality and safety control. Trends Food Sci. Technol. 2007, 18, 590–598. [Google Scholar] [CrossRef]
  35. Yang, H.; Chen, Q.; Qian, J.; Li, J.; Lin, X.; Liu, Z.; Fan, N.; Ma, W. Determination of Dry-Matter Content of Kiwifruit before Harvest Based on Hyperspectral Imaging. AgriEngineering 2024, 6, 52–63. [Google Scholar] [CrossRef]
  36. Li, Y.; He, N.; Hou, J.; Xu, L.; Liu, C.; Zhang, J.; Wang, Q.; Zhang, X.; Wu, X. Factors Influencing Leaf Chlorophyll Content in Natural Forests at the Biome Scale. Front. Ecol. Evol. 2018, 6, 64. [Google Scholar] [CrossRef]
  37. Herawati, N.; Wijayanti, A.; Sutrisno, A.; Nusyirwan; Misgiyati. The Performance of Ridge Regression, LASSO, and Elastic-Net in Controlling Multicollinearity: A Simulation and Application. J. Mod. Appl. Stat. Methods 2024, 23, 4. [Google Scholar] [CrossRef]
  38. Simske, S. Introduction, overview, and applications. In Meta-Analytics; Morgan Kaufmann: Cambridge, MA, USA, 2019; pp. 1–98. [Google Scholar] [CrossRef]
  39. Nnaji, C.; Nwodo, U. Predicting Customer Churn In The Telecommunication Industry Using Machine Learning Algorithms: Performance Comparison with Logistic Regression, Random Forest, and Gradient Boosting Techniques. Mach. Learn. 2022, 22, 3–66. [Google Scholar]
  40. Shi, Y.; Yang, K.; Yang, Z.; Zhou, Y. (Eds.) Primer on artificial intelligence. In Mobile Edge Artificial Intelligence; Academic Press: Cambridge, MA, USA, 2022; pp. 7–36. [Google Scholar] [CrossRef]
  41. Owolabi, I.O.; Saibandith, B.; Wichienchot, S.; Yupanqui, C.T. Nutritional compositions, polyphenolic profiles and antioxidant properties of pigmented rice varieties and adlay seeds enhanced by soaking and germination conditions. Funct. Foods Health Dis. 2018, 8, 561–578. [Google Scholar] [CrossRef]
  42. Meilawati, L.; Ernawati, T.; Dewi, R.T.; Meilawati, S.L. Study of Total Phenolic, Total Flavonoid, Scopoletin Contents and Antioxidant Activity of Extract of Ripened Noni Juice. Indones. J. Appl. Chem. 2021, 23, 55–62. [Google Scholar] [CrossRef]
  43. Galieni, A.; Falcinelli, B.; Stagnari, F.; Datti, A.; Benincasa, P. Sprouts and Microgreens: Trends, Opportunities, and Horizons for Novel Research. Agronomy 2020, 10, 1424. [Google Scholar] [CrossRef]
  44. De la Cruz Chacón, I.; Riley Saldaña, A.; González-Esquinca, A.R. Secondary metabolites during early development in plants. Phytochem. Rev. 2012, 12, 47–64. [Google Scholar] [CrossRef]
  45. Castro, K.L.; Sanchez-Azofeifa, G.A. Changes in Spectral Properties, Chlorophyll Content and Internal Mesophyll Structure of Senescing Populus balsamifera and Populus tremuloides Leaves. Sensors 2008, 8, 51–69. [Google Scholar] [CrossRef] [PubMed]
  46. Song, Y.; Liang, J.; Lu, J.; Zhao, X. An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing 2017, 251, 26–34. [Google Scholar] [CrossRef]
  47. Awodeyi, A.I.; Ibok, O.A.; Omokaro, I.; Ekwemuka, J.U.; Ighofiomoni, M.O. Effective preprocessing techniques for improved facial recognition under variable conditions. Frankl. Open 2025, 10, 100225. [Google Scholar] [CrossRef]
  48. Demircioğlu, A. The effect of preprocessing filters on predictive performance in radiomics. Eur. Radiol. Exp. 2022, 6, 40. [Google Scholar] [CrossRef]
  49. Pane, C.; Nicastro, N.; Manganiello, G.; Carotenuto, F.; Pallottino, F.; Costa, C. Hyperspectral imaging to oversee the status of baby leaf vegetable crops: The Agrofiliere Project results. In Proceedings of the 2023 IEEE International Workshop on Metrology for Agriculture and Forestry (MetroAgriFor), Pisa, Italy, 6–8 November 2023; pp. 501–505. [Google Scholar] [CrossRef]
  50. Liu, X.; Li, Z.; Xiang, Y.; Tang, Z.; Huang, X.; Shi, H.; Sun, T.; Yang, W.; Cui, S.; Chen, G.; et al. Estimation of Winter Wheat Chlorophyll Content Based on Wavelet Transform and the Optimal Spectral Index. Agronomy 2024, 14, 1309. [Google Scholar] [CrossRef]
  51. Ullah, F.; Kumar, K.; Rahim, T.; Khan, J.; Jung, Y. A new hybrid image denoising algorithm using adaptive and modified decision-based filters for enhanced image quality. Sci. Rep. 2025, 15, 8971. [Google Scholar] [CrossRef]
  52. Wang, N.; Yang, A.; Cui, Z.; Yao, D.; Xue, Y.; Su, Y. Capsule Attention Network for Hyperspectral Image Classification. Remote Sens. 2024, 16, 4001. [Google Scholar] [CrossRef]
Figure 1. Workflow of trait data collection including the total phenolic content (TPC) and the total flavonoid content (TFC) measurements and hyperspectral imaging (HSI) analysis for sunflower microgreens.
Figure 2. Hyperspectral imaging system used for microgreen analysis. (a) The imaging setup consists of a hyperspectral camera. (b) RGB image of a microgreen sample captured under the system. (c) The corresponding hyperspectral data cube. (d) Selected hyperspectral images at different wavelengths.
Figure 3. Variation in the total phenolic content (TPC, mg GAE/100 g) and total flavonoid content (TFC, mg QE/100 g) of the leaves and stems of sunflower microgreens across different growth stages of cultivation (Days 5, 6, and 7).
Figure 4. Principal component analysis (PCA) of the sunflower microgreens’ spectral data: (a) reflectance spectra of the leaves and stems, (b) plot of the explained and cumulative variance of the first five principal components (PCs), (c) PC1 loadings, and (d) K-means clustering based on PCA results.
Figure 5. Matrices illustrating the performance of the tested machine learning models for predicting total phenolic content (TPC) and total flavonoid content (TFC) based on hyperspectral data. When using the full spectral dataset, K-nearest neighbors (KNN) achieved the highest R² (0.95) for TPC prediction, and ridge regression achieved the highest R² (0.97) for TFC prediction. When using the principal components, PC1 alone or PC1 combined with PC2 resulted in slightly lower model accuracy than the full hyperspectral dataset, and RMSE values were greater when the data dimensions were reduced.
Figure 6. RGB image of a sunflower microgreen sample collected on Day 7 of cultivation. Seven positions, Points 1–7, were selected across the microgreen to obtain spectra for mapping of secondary metabolites.
Table 1. Predicted total phenolic content (TPC) and total flavonoid content (TFC) values at selected points (1–7) on a sunflower microgreen sample collected on Day 7. The values were obtained using pre-trained machine learning models: k-nearest neighbours (KNN) for TPC and ridge regression for TFC. Points correspond to those labeled in Figure 6.
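| Point | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TPC (mg GAE/100 g) | 17.57 | 17.57 | 17.57 | 17.57 | 10.59 | 8.87 | 5.90 |
| TFC (mg QE/100 g) | 46.78 | 41.90 | 42.42 | 44.87 | 37.37 | 37.51 | 36.97 |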
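Point-wise mapping of this kind amounts to applying an already-trained model to the spectrum sampled at each chosen position. The following is a minimal sketch of KNN regression, assuming uniform weights and Euclidean distance. The training spectra, trait values, point names, and `k` are hypothetical toy values rather than the study's data, and `knn_predict` is an illustrative helper, not the authors' implementation.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a trait value as the mean over the k nearest training spectra."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    return sum(y for _, y in by_distance[:k]) / k


# Hypothetical training spectra (reflectance at four bands) with toy trait
# values in mg GAE/100 g; the first three mimic leaves, the last three stems.
train_X = [
    [0.10, 0.12, 0.40, 0.45],
    [0.11, 0.13, 0.42, 0.46],
    [0.09, 0.11, 0.41, 0.44],
    [0.20, 0.22, 0.30, 0.31],
    [0.21, 0.23, 0.29, 0.30],
    [0.22, 0.24, 0.28, 0.29],
]
train_y = [18.0, 17.0, 17.5, 9.0, 8.0, 6.0]

# Spectra sampled at three hypothetical positions on one plant.
points = {
    "leaf_a": [0.100, 0.120, 0.41, 0.450],
    "leaf_b": [0.105, 0.125, 0.41, 0.455],
    "stem": [0.210, 0.230, 0.29, 0.300],
}
predictions = {name: knn_predict(train_X, train_y, s) for name, s in points.items()}
for name, value in predictions.items():
    print(f"{name}: {value:.2f}")
```

With uniform weights, a KNN regressor returns the same value for every query spectrum that shares the same k nearest training samples, so `leaf_a` and `leaf_b` receive identical predictions in this toy example.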
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Boonrat, P.; Patel, M.; Pengphorm, P.; Detarun, P.; Daengngam, C. Hyperspectral Imaging for the Dynamic Mapping of Total Phenolic and Flavonoid Contents in Microgreens. AgriEngineering 2025, 7, 107. https://doi.org/10.3390/agriengineering7040107
