1. Introduction
Broccoli (Brassica oleracea L. var. italica) belongs to the Cruciferae family and the Brassica genus. It is rich in nutrients and bioactive compounds that benefit human health, which makes it popular with consumers [1]. Broccoli is grown worldwide, with China as the leading producer. According to 2023 statistics from the Food and Agriculture Organization (FAO), the harvested area in China reached 498,862 hectares. However, post-harvest storage of broccoli faces numerous challenges, including variations in temperature and humidity. Unfavorable storage conditions accelerate wilting and yellowing, leading to rapid nutrient loss and a decline in functional properties [2]. The demand for rapid, non-destructive quality assessment of broccoli has therefore become increasingly urgent.
The quality of broccoli is closely linked to its shelf life [3,4]. Accurate assessment of shelf life and quality is vital for effective resource management and plays a key role in economic estimation. Various methods have been adopted to extend shelf life, including control of storage temperature and humidity and chemical treatments. For example, Patil et al. [5], Paulsen et al. [6], and Zhan et al. [7] investigated strategies for controlling post-harvest storage temperature, while Pintos et al. [8] and Loi et al. [9] evaluated the effects of light conditions in storage environments. Xu et al. [10] and Cai et al. [11] applied chemical treatments to extend broccoli’s shelf life. However, shelf life is still most commonly evaluated by experienced professionals using sensory methods that are time-consuming, labor-intensive, and subjective. Precise determination of broccoli’s shelf life and quality throughout the supply chain and supermarket retail process would therefore support effective resource management.
Spectral imaging technology, which integrates traditional imaging with spectroscopy, acquires spatial and spectral information from samples simultaneously and offers speed and accuracy without damaging the product [12]. Spectral imaging combined with machine learning and deep learning algorithms has been widely applied to the quality and shelf-life assessment of agricultural products [13,14]. For example, Sricharoonratana et al. [15] used hyperspectral imaging with partial least squares regression and partial least squares discriminant analysis to accurately predict and classify cake shelf life, demonstrating the feasibility of combining hyperspectral imaging with machine learning for shelf-life prediction. Siripatrawan and Makino [16] developed a backpropagation neural network model for the rapid and accurate evaluation of sausage shelf life, showing the effectiveness of combining hyperspectral imaging with neural networks. Shao et al. [17] constructed a Library for Support Vector Machines (LIBSVM) model to analyze winter jujube shelf life, achieving accuracies of 89% and 91% for medium and mature jujubes, respectively, and demonstrating the potential of hyperspectral imaging combined with support vector machines to monitor post-harvest shelf life. Furthermore, Tang et al. [18] and Hu et al. [19] established random forest (RF) and support vector machine (SVM) models to evaluate tea quality, confirming that spectral imaging combined with machine learning can assess tea quality effectively. Collectively, these studies support the practical application of spectral imaging to shelf-life prediction. However, changes in the physical appearance of food products during shelf life are also important indicators of quality.
Yin et al. [20] integrated texture features, which are key phenotypic features, with spectral information to classify tea quality. Zhang et al. [21] used near-infrared hyperspectral imaging with spectral and texture features for the non-destructive determination of fat and moisture content in salmon. Because multispectral imaging is cost-effective, many researchers have also combined it with machine learning to create rapid, non-destructive quality assessment tools for agricultural products. For example, Lytou et al. [22] combined multispectral imaging with deep learning for rapid microbial prediction in fish fillets across the supply chain. Zhang et al. [23] combined multispectral imaging with chemometric methods to predict variation in the dietary fiber content of Chinese cabbage across growth stages, achieving an R² of 0.9023; this method can be further adapted to provide technical support for produce sorting and grading. Yang et al. [24] used multispectral imaging to rapidly assess seed moisture content, with a backpropagation neural network model achieving an accuracy of 90.1%, providing an effective monitoring method for seed storage. Duan et al. [25] developed a model for detecting pepper diseases using multispectral imaging combined with texture features. Their findings confirm that integrating texture features with spectral data is more effective than relying on spectral data alone, further supporting the feasibility of multi-feature data fusion. Using multispectral imaging combined with multi-feature data fusion and machine learning to predict and evaluate the shelf life of broccoli is therefore feasible, and it provides a basis for studying chlorophyll and other physicochemical parameters as well as for determining shelf life.
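The R² values quoted for regression models in this section are coefficients of determination. As a reminder of what that figure measures, the following minimal sketch computes R² and the root mean square error from reference and predicted values. The function names are illustrative and do not come from any of the cited studies.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the measured parameter."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

An R² of 1 means the predictions reproduce the reference values exactly, while an R² of 0 means the model does no better than always predicting the mean.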
Accordingly, this study combined multispectral imaging with multi-feature data fusion and machine learning to develop a rapid and accurate model for predicting and evaluating broccoli shelf life. Specifically, spectral data and texture features were extracted from multispectral images, and the moisture content and photosynthetic pigments (chlorophyll and carotenoids) of broccoli were measured. Using these data, support vector regression (SVR), RF, and 2D convolutional neural network (2D-CNN) models were built to predict the physicochemical parameters of broccoli throughout its shelf life. Additionally, 1D-CNN, RF, and 2D-CNN models were developed to evaluate shelf life. The prediction and evaluation results were then analyzed systematically to validate the feasibility of combining multispectral imaging with multi-feature data fusion and machine learning for rapid, accurate, and non-destructive shelf-life prediction and evaluation.
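One common way to realize multi-feature data fusion is feature-level concatenation: each sample's spectral, texture, and physicochemical values are scaled and joined into a single vector before being passed to a classifier. The sketch below illustrates that idea only. The per-column z-score scaling and the function names are assumptions made for illustration, not details taken from this study.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale every column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; dividing by 1.0 maps it to all zeros.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse_features(spectral, texture, physchem):
    """Concatenate three per-sample feature blocks after scaling each block."""
    if not (len(spectral) == len(texture) == len(physchem)):
        raise ValueError("every block needs one row per sample")
    blocks = [zscore_columns(block) for block in (spectral, texture, physchem)]
    # Row i of the result is [spectral_i..., texture_i..., physchem_i...].
    return [s + t + p for s, t, p in zip(*blocks)]
```

Scaling before concatenation keeps a block with large numeric values, such as a pigment concentration, from dominating reflectance values that lie between 0 and 1. Tree-based models such as RF are largely insensitive to this scaling, whereas SVR and neural networks usually benefit from it.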
4. Discussion
The color, appearance, and intrinsic substance content of broccoli differ across shelf-life stages, and the differing color and degree of wilting of the buds affect the multispectral image information. Most studies determine quality and shelf life only by predicting intrinsic substance content with spectral technology [35], without considering appearance or texture. In this study, by contrast, the spectral data, texture features, and physicochemical parameters of broccoli were used together to predict and evaluate its shelf life.
Relative to traditional methods, this study focuses on prediction and evaluation performance based on multispectral imaging. Conventional approaches, such as manual evaluation, require extensive expertise and are time-consuming, labor-intensive, and highly subjective, which makes them difficult to apply widely in practice [17]. This study provides a rapid, non-destructive, and accurate alternative. In contrast to Mohi Alden et al. [36], who manually assessed cauliflower quality based on appearance, odor, and texture during storage, and Joshi et al. [37], who determined strawberry shelf life through sensory evaluation, this study uses multispectral imaging to provide more objective quantitative data, thereby improving the accuracy and consistency of the evaluation. The physicochemical parameter prediction model developed here allows these parameters to be obtained rapidly and non-destructively during practical evaluation. This reduces the need for destructive measurements of internal substance content, improving evaluation efficiency and demonstrating the practical feasibility of the proposed non-destructive method. The 2D-CNN model based on multispectral images performed significantly worse at predicting physicochemical parameters than the SVR and RF models based on one-dimensional spectral data, which is consistent with the findings of Zhang et al. [26] for photosynthetic pigments. For the SVR and RF models, the SPA algorithm was used to select relevant spectral bands, removing irrelevant spectral information and thereby improving both computational efficiency and accuracy. In addition, four preprocessing methods were applied to the raw spectral data, and these improved model accuracy relative to models built on the raw data.
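The four preprocessing methods are not named in this passage, but the best classification model is labeled SPA+SG+RF. Assuming that SG denotes Savitzky-Golay smoothing, the following minimal sketch shows what such a step does to a single spectrum. The 5-point window, the quadratic polynomial, and the choice to leave the edge points unchanged are illustrative assumptions, not the settings used in this study.

```python
# Convolution weights of the 5-point quadratic Savitzky-Golay smoother.
# They are applied as a weighted sum and then divided by 35.
SG5_QUADRATIC = (-3, 12, 17, 12, -3)


def savgol_smooth_5(spectrum):
    """Smooth one spectrum with a 5-point quadratic Savitzky-Golay filter.

    The two points at each end have no full window and are left unchanged.
    """
    n = len(spectrum)
    if n < 5:
        raise ValueError("need at least 5 points for a 5-point window")
    smoothed = [float(v) for v in spectrum]
    for i in range(2, n - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5_QUADRATIC, window)) / 35.0
    return smoothed
```

Because the filter fits a local quadratic, it leaves any spectrum that is itself quadratic unchanged while damping isolated spikes. For a 19-band multispectral signal, a short window such as this one is about the largest that remains sensible.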
Compared to traditional data fusion methods [38], which integrate spectral data and texture features, this study adds physicochemical parameters that capture variations in internal substance content during broccoli shelf life. This further enriches the feature information and enhances the predictive capability of the model, highlighting the advantages of multi-feature fusion. To the best of our knowledge, traditional data fusion methods have been applied primarily to tea grade classification [20] and meat shelf-life assessment [22], with limited application to fruit and vegetable shelf-life evaluation. In contrast to Rabasco-Vílchez et al. [39], who used near-infrared spectral data to assess strawberry shelf life based on storage time and temperature, this study achieves non-destructive prediction of physicochemical parameters and integrates spectral data, texture features, and physicochemical parameters for comprehensive broccoli shelf-life evaluation. Compared to Li et al. [40], who predicted kiwifruit shelf life by assessing substance content with near-infrared spectroscopy, this study also incorporates texture features, capturing both external spatial information and internal substance content and thereby improving evaluation accuracy. Together, the physicochemical parameters reflecting internal substance content and the texture features capturing external spatial information are crucial for comprehensively evaluating appearance changes and internal quality in vegetables such as broccoli, and this approach demonstrates superior overall performance.
In this study, the SPA+SG+RF model, built within the RF framework, emerged as the best model for broccoli shelf-life prediction, with training and testing accuracies of 88.98% and 88.64%, respectively. By contrast, the 2D-CNN classification model based on multispectral images was less accurate than the 1D-CNN and RF models that used multi-feature data fusion. This is because the 2D-CNN model takes 100 × 100 × 19 multispectral images as input without extracting spectral features, and the small gaps and shadows between broccoli florets in the images reduce model accuracy. Comparison of the three multi-feature fusion approaches indicates that models combining LBP and GLCM texture features with spectral data outperform those using spectral data alone, while fusing spectral data, LBP and GLCM texture features, and physicochemical parameters yields the best performance. This further demonstrates that integrating LBP and GLCM texture features with physicochemical parameters improves the accuracy of broccoli shelf-life evaluation and confirms the feasibility of multi-feature data fusion.
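To make the two texture descriptors concrete, the sketch below computes a gray-level co-occurrence matrix (GLCM) with three common statistics and a basic 8-neighbour local binary pattern (LBP) histogram for one quantized single-band image. The pixel offset, the choice of statistics, and the basic (non-uniform) LBP variant are assumptions made for illustration; the settings used in this study are not given in this section.

```python
def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    `image` is a list of rows holding integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image is too small for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, angular second moment (ASM), and homogeneity of a GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Clockwise 8-neighbourhood, starting at the top-left neighbour.
NEIGHBOURS = ((-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1))


def lbp_histogram(image):
    """Normalized 256-bin histogram of basic LBP codes over interior pixels."""
    rows, cols = len(image), len(image[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3 x 3")
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    n = sum(hist)
    return [h / n for h in hist]
```

The GLCM statistics summarize how often gray levels co-occur at a fixed offset, which reflects the coarseness of the floret surface. The LBP histogram summarizes local brightness patterns around each pixel. Either result can be appended to a spectral feature vector before classification.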
Although this study collected a substantial dataset and enriched the feature information through multi-feature data fusion, misclassifications remain, as shown by the results of the best classification model (Figure 12). More accurate broccoli shelf-life evaluation in real-world applications will require further optimization of the model structure. In addition, the diversity of the dataset (e.g., variations in growing environments and storage conditions) remains relatively limited, which may constrain the model’s generalizability across retail and supply chain scenarios. Future research could expand the dataset to include broccoli samples from different growing environments and storage conditions, explore additional types of spectral image data and physicochemical parameters, and further refine data fusion methods to improve generalizability. As artificial intelligence and deep learning continue to advance, more sophisticated neural network models are expected to provide more precise shelf-life prediction and evaluation. This research could also be extended to the shelf-life assessment of other vegetables and fruits, offering theoretical support and practical guidance for advances in food preservation technology.