Article

Hyperspectral Imaging for Quality Assessment of Processed Foods: A Case Study on Sugar Content in Apple Jam

by Danila Lissovoy 1,†, Alina Zakeryanova 1,†, Rustem Orazbayev 1, Tomiris Rakhimzhanova 2, Michael Lewis 1, Huseyin Atakan Varol 2 and Mei-Yen Chan 3,*

1 School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
2 Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan
3 Department of Biomedical Sciences, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Foods 2025, 14(21), 3585; https://doi.org/10.3390/foods14213585
Submission received: 27 August 2025 / Revised: 19 September 2025 / Accepted: 9 October 2025 / Published: 22 October 2025

Abstract

Apple jam is a widely used all-season product. The quality of the jam is closely related to its sugar concentration, which affects its taste, texture, shelf life, and legal compliance with production requirements. Although traditional methods for measuring sugar, such as titration, enzymatic methods, and chromatography, are accurate, they are also invasive, destructive, and unsuitable for rapid screening. This study investigates a non-destructive and non-invasive alternative method that uses hyperspectral imaging (HSI) in combination with machine learning to estimate the sugar content in processed apple products. Eight cultivars were selected from the Central Asian region, recognized as the origin of apples and known for its rich diversity of apple cultivars. A total of 88 jam samples were prepared with sugar concentrations ranging from 25% to 75%. For each sample, several hyperspectral images were obtained using a visible-to-near-infrared (VNIR) camera. The acquired spectral data were then processed and analyzed using regression models, including the support vector machine (SVM), eXtreme gradient boosting (XGBoost), and a one-dimensional residual network (1D ResNet). Among them, ResNet achieved the highest prediction accuracy of R2 = 0.948. The results highlight the potential of HSI and machine learning for a fast, accurate, and non-invasive assessment of the sugar content in processed foods.

1. Introduction

Apples are one of the most widely consumed and cultivated fruits worldwide, and serve as a good source of vitamins and antioxidants. Regular consumption of fruits, including apples, has been associated with health benefits such as a lower risk of developing asthma and cardiovascular diseases, and some types of cancers [1]. The origins of this nutritionally valuable fruit can be traced back to Central Asia, which is widely recognized as the primary center of genetic diversity, due to the presence of the wild ancestor of modern apple cultivars Malus sieversii [2]. The region’s unique ecological [3] conditions and long history of cultivation have contributed to an exceptionally rich apple gene pool, featuring diverse fruit sizes, shapes, colors, textures, and flavors. This natural variability makes Central Asian apples an ideal subject for studies focused on this area [2].
Apples are seasonal fruits. To prolong their availability beyond the harvesting season, they are frequently preserved in the form of jams, jellies, and marmalades. Apple jam is a processed product made by boiling apple puree with sugars, such as sucrose or fructose syrup [4]. Over the years, this traditional practice of fruit preservation has developed into a worldwide industry. Increasing demand for natural, fruit-based products has fueled the fruit jam, jelly, and preserves market, which was worth USD 1.2 billion in 2024 and is predicted to reach USD 1.8 billion by 2033 [5].
Accurate sugar concentration in apple jam is crucial for maintaining quality, ensuring safety, and complying with legal requirements. According to the Codex Alimentarius, the total soluble solids (TSSs) must be at least 60–65% [4]. Insufficient sugar can compromise product quality through spoilage, while excessive intake poses health risks such as weight gain and obesity [6], highlighting the need for precise measurement methods [7,8]. Traditional methods for analyzing sugar content, such as titration [9,10], chromatography (e.g., high-performance liquid chromatography [11,12], gas chromatography–mass spectrometry [13]), and enzymatic assays [14], are highly accurate but also destructive, time-consuming, and require well-trained personnel and specialized laboratory conditions [15]. Faster, non-destructive alternatives, such as refractometry [16], near-infrared spectroscopy [17,18], and Fourier transform infrared spectroscopy [19], are commonly used; however, they lack spatial resolution [20], which renders them less effective in industrial conditions where real-time monitoring is necessary.
Hyperspectral imaging (HSI), especially when combined with machine learning techniques, offers a non-destructive and cost-effective alternative that requires minimal sample preparation and does not necessitate a sterile laboratory setting [15]. HSI has gained popularity in various fields, including ecology [21,22], medicine [23], and food quality control [24] due to its high precision and capacity to automatically analyze large amounts of data.
In the food industry, HSI is used to assess physical and chemical quality characteristics in products such as meat [25,26], tea [27], and coffee [28], as well as to monitor the health of crops [24]. These applications improve quality control while also reducing operational costs through process automation and early detection of irregularities.
While existing research studies have demonstrated the capability of HSI in assessing the quality of fruits and vegetables, including moisture, firmness, and ripeness, its application in processed food products remains underexplored. One frequently targeted parameter is soluble solids content (SSC), which refers to the concentration of sugars, acids, and other dissolved compounds that influence taste, maturity, and commercial value. As summarized in Table 1, many studies have successfully employed HSI in combination with machine learning techniques, including classical algorithms and deep learning models, to predict SSC and related attributes in the raw product. For example, Rady et al. [29] applied partial least squares regression (PLSR) and neural networks to evaluate glucose levels in potatoes, while Yun et al. [30] utilized a 1D ResNet to enhance the predictive accuracy of state-of-the-art techniques for tomato firmness. Similar approaches have been applied to grapes [31], kiwis [32], strawberries [33], peaches [34], and apples [35,36,37], demonstrating high predictive accuracy.
Although HSI is well established in raw produce quality assessment, its use for processed fruit products remains underexplored [38,39]. Processed foods introduce additional complexity due to heterogeneous composition and visual noise, which makes non-destructive quality assessment more challenging.
The literature review did not identify any studies that investigated the application of hyperspectral imaging combined with machine learning to estimate sugar content in a processed food product, such as apple jam. Jam is defined as a product brought to a suitable consistency, made from whole fruit, pieces of fruit, unconcentrated and/or concentrated fruit pulp or fruit puree of one or more kinds of fruit, mixed with sweetening foodstuffs with or without the addition of water [4]. This research addresses this gap by applying HSI, coupled with machine learning models, to assess the sugar content of apple jam, a commonly consumed processed fruit product. In this context, machine learning enables the automatic discovery of patterns in the hyperspectral data, allowing sugar concentration to be predicted from image information without destructive testing. The study makes the following main contributions:
  • Release of an open-source, well-structured, annotated hyperspectral image dataset of apple jam prepared from 8 apple cultivars native to Central Asia, the primary center of apple genetic diversity, across eleven sugar levels.
  • Acquisition of 1760 annotated hyperspectral images under controlled lighting, angle, and distance conditions using a visible to near-infrared (VNIR) hyperspectral camera, to provide a standardized and reproducible basis for machine learning and deep-learning experiments.
  • Comparative analysis of hyperspectral and RGB imaging for sugar content prediction in apple jam, with model performance assessed using standard regression metrics (R2, RMSE, MAE).
  • Benchmarking of classical machine learning methods (SVM, XGBoost) and deep-learning models (1D ResNet) for regression of sugar concentration levels, with systematic assessment across multiple spatial grid configurations and data-splitting strategies.
  • Public release of the dataset, code, and trained models on GitHub to support reproducibility and further research.

2. Materials and Methods

2.1. Experimental Setup

Eight apple cultivars common to the Central Asia region (Granny Smith, Aport, Gala, Starcrimson, Idared, Golden, Simirenko, and Red Jonaprince) were chosen as the raw material for this study (see Figure 1). For each cultivar, 3 kg of apples were thoroughly washed, peeled, deseeded, and blended into a homogeneous mass using a conventional food processor. Sucrose was then added in 5% increments to obtain 11 sugar concentration levels ranging from 25% to 75%. Each mixture was cooked under controlled and consistent conditions until a jam-like texture with uniform consistency and color was obtained. The prepared samples were then distributed on a white dish in three layers of varying thicknesses (0.5 cm, 1 cm, and 2 cm) to obtain hyperspectral images under different conditions.
After the jam samples were prepared, hyperspectral images were acquired using a controlled procedure to ensure consistency and reproducibility across all samples. The imaging system consisted of a portable hyperspectral camera, Specim IQ (Specim, Spectral Imaging Ltd., Oulu, Finland) [40], which operates over the 400–1000 nm wavelength range (VNIR) and collects data in 204 spectral bands with a spectral resolution of approximately 3 nm. The camera features a spatial resolution of 512 × 512 pixels and a built-in lens with a diameter of 18 mm, optimized for close-up shooting.
To ensure uniform illumination, two 50 W halogen light sources were placed symmetrically at an angle of 45 degrees on both sides of the camera at a height of 50 cm from the sample surface. This configuration minimizes shadows and specular reflections while keeping the illumination even throughout the field of view. The hyperspectral camera was mounted on a fixed tripod and positioned directly above the sample to obtain a top-down image, and then tilted to an angle of 45 degrees to obtain a side view that captures additional variability of the surface and geometric dimensions (see Figure 2).
The images were taken from three distances: 20 cm, 30 cm, and 40 cm, to ensure variability of spatial resolution and perspective. Radiometric calibration was performed before each imaging session using the integrated procedures provided by Specim IQ. This included capturing a white reference image using the white dish where the jam was placed, and a dark reference image by closing the camera shutter. The calibration data were then used to compute pixel-wise reflectance values R according to the following equation:
$$R = \frac{I - D}{W - D}$$
where I is the raw intensity image, D is the dark reference, and W is the white reference. This correction accounts for sensor noise and illumination nonuniformity, ensuring that the resulting reflectance spectra are consistent across different imaging sessions.
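The correction can be illustrated with a short NumPy sketch. It assumes the raw cube and the two references are available as arrays of the same shape; the function name `calibrate_reflectance` and the small `eps` guard against division by zero are illustrative additions, not part of the Specim IQ procedure described above.

```python
import numpy as np


def calibrate_reflectance(raw, dark, white, eps=1e-8):
    """Pixel-wise reflectance R = (I - D) / (W - D)."""
    raw = np.asarray(raw, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    denom = white - dark
    # Illustrative safeguard: avoid dividing by zero where the white and
    # dark references coincide (not part of the source equation).
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (raw - dark) / denom


# Synthetic 2 x 2 pixel, 3-band example (the real cubes are 512 x 512 x 204).
dark = np.full((2, 2, 3), 100.0)
white = np.full((2, 2, 3), 1100.0)
raw = np.full((2, 2, 3), 600.0)
reflectance = calibrate_reflectance(raw, dark, white)
```

In this synthetic case every raw value lies halfway between the dark and white references, so each reflectance value is 0.5.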
The imaging procedure was carried out for all eight apple cultivars, across eleven sugar concentration levels, and three jam thicknesses (0.5 cm, 1.0 cm, and 2.0 cm). For the 0.5 cm thickness, images were acquired from six different angles, whereas for the 1.0 cm and 2.0 cm thicknesses, images were captured from seven angles. As a result, 1760 hyperspectral images were acquired.
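As a quick consistency check, the image count follows directly from the acquisition design. The sketch below only restates the numbers given above; the variable names are illustrative.

```python
# Acquisition design as described in the text.
cultivars = 8
sugar_levels = 11
views_per_thickness_cm = {0.5: 6, 1.0: 7, 2.0: 7}

jam_samples = cultivars * sugar_levels                     # one jam per cultivar and sugar level
views_per_sample = sum(views_per_thickness_cm.values())    # 6 + 7 + 7 views
total_images = jam_samples * views_per_sample
```

This gives 88 jam samples with 20 images each, for 1760 hyperspectral images in total.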

2.2. Data Acquisition and Preprocessing

After all the images were acquired, the hyperspectral data, consisting of multiple files generated by Specim IQ per capture, were extracted. For each sample, the .hdr and .dat files representing a hyperspectral cube were isolated as the primary source of reflectance data.
To identify and isolate the jam-covered area from the background, a binary mask was constructed for each image using the spectral angle mapper (SAM) algorithm. This method computes the angular similarity between each pixel’s spectral vector p and a predefined reference spectrum r, obtained from a homogeneous jam region. The spectral angle θ is defined as:
$$\theta(p, r) = \cos^{-1}\left(\frac{p \cdot r}{\lVert p \rVert \, \lVert r \rVert}\right)$$
A threshold of 0.2 radians was applied, and pixels satisfying this criterion were considered part of the jam. The smallest bounding box enclosing these pixels defined the region of interest (ROI), and each image was subsequently cropped to this ROI, the example of which is enclosed in a green rectangle in Figure 3.
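A minimal NumPy sketch of this masking and cropping step is given below. It assumes that a pixel belongs to the jam when its spectral angle is at most 0.2 rad (the text does not state the direction of the comparison explicitly), and the function names `spectral_angle` and `jam_roi` are hypothetical.

```python
import numpy as np


def spectral_angle(cube, reference):
    """Angle in radians between each pixel spectrum and the reference spectrum."""
    cube = np.asarray(cube, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    dot = cube @ reference                                   # shape (H, W)
    norms = np.linalg.norm(cube, axis=-1) * np.linalg.norm(reference)
    # Zero-norm pixels get cos = 0 (angle pi/2) and therefore fall outside the mask.
    cos = np.divide(dot, norms, out=np.zeros_like(dot), where=norms > 0)
    return np.arccos(np.clip(cos, -1.0, 1.0))


def jam_roi(cube, reference, threshold=0.2):
    """Binary jam mask and smallest bounding box (r0, r1, c0, c1) around it."""
    mask = spectral_angle(cube, reference) <= threshold      # assumed comparison
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("no pixel is within the spectral angle threshold")
    bbox = (int(rows.min()), int(rows.max()) + 1,
            int(cols.min()), int(cols.max()) + 1)
    return mask, bbox


# Synthetic 6 x 6 scene with 3 bands: a 3 x 3 jam patch on a different background.
reference = np.array([1.0, 2.0, 3.0])
background = np.array([3.0, 1.0, 0.5])
cube = np.tile(background, (6, 6, 1))
cube[2:5, 1:4] = 2.0 * reference      # same spectral shape, different brightness
mask, bbox = jam_roi(cube, reference)
roi = cube[bbox[0]:bbox[1], bbox[2]:bbox[3]]
```

Because the spectral angle depends only on the direction of the spectrum, the brighter jam pixels in the example still match the reference, which is the property that makes SAM tolerant to illumination differences.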
While the ROI extraction effectively isolated the jam-covered areas, some cropped images still contained irrelevant visual information, such as the edges of dishes or the background. To minimize this noise and maximize the proportion of jam content, a 10% margin was trimmed from each side of the cropped images prior to subdivision.
Then, each cropped hyperspectral image was subdivided into multiple grids (2 × 2, 3 × 3, 4 × 4, and 5 × 5) to investigate how varying the level of spatial subdivision would affect model learning. These were compared against the original cropped images without any grid applied (1 × 1), allowing for a systematic evaluation of spatial resolution versus predictive performance. This strategy not only expanded the dataset but also aimed to identify the most effective grid configuration for improving generalization in the context of a relatively limited sample size of 1760 full images. An illustration of the applied grid configurations is provided in Figure 3.
Each subregion was processed independently in the following steps. For each of the generated subregions, segmented pixels were used to compute a mean spectral vector, which was then normalized to unit length. This resulted in a compact and consistent feature representation for every jam region.
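The margin removal, grid subdivision, and feature extraction can be sketched as follows. The sketch reads the 10% margin as a trim from each side of the cropped image and assumes cells are delimited by evenly spaced integer edges; `trim_margin` and `grid_features` are hypothetical names, and the optional `mask` argument stands for the jam mask restricted to the same crop.

```python
import numpy as np


def trim_margin(cube, fraction=0.10):
    """Remove `fraction` of the height and width from each side of the cube."""
    h, w = cube.shape[:2]
    dh, dw = int(round(h * fraction)), int(round(w * fraction))
    return cube[dh:h - dh, dw:w - dw]


def grid_features(cube, n, mask=None):
    """Split the cube into n x n cells; return ((row, col), unit mean spectrum) per cell."""
    h, w, _ = cube.shape
    if mask is None:
        mask = np.ones((h, w), dtype=bool)
    row_edges = np.linspace(0, h, n + 1).astype(int)
    col_edges = np.linspace(0, w, n + 1).astype(int)
    features = []
    for i in range(n):
        for j in range(n):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            pixels = cube[r0:r1, c0:c1][mask[r0:r1, c0:c1]]   # (n_pixels, bands)
            if pixels.size == 0:
                continue                                       # no jam pixels in this cell
            mean = pixels.mean(axis=0)
            norm = np.linalg.norm(mean)
            if norm > 0:
                features.append(((i, j), mean / norm))         # unit-length mean spectrum
    return features


# Synthetic cropped ROI: 50 x 40 pixels, 204 bands.
rng = np.random.default_rng(0)
roi = rng.random((50, 40, 204))
trimmed = trim_margin(roi)
features = grid_features(trimmed, n=4)
```

With a 4 × 4 grid, one image yields up to 16 feature vectors of 204 values each, which is how the subdivision enlarges the dataset.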

2.3. Statistical Analysis of the Dataset

The finalized dataset consisted of 1760 cropped hyperspectral images of apple jam prepared from eight cultivars native to Central Asia across eleven sugar concentration levels: 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%. Data were stored in tabular comma-separated values (CSV) format, where each row corresponds to one subregion. In total, five separate datasets were generated based on the grid configurations (1 × 1, 2 × 2, 3 × 3, 4 × 4, and 5 × 5), each representing a different level of spatial subdivision. In addition to the normalized spectral features, each record includes associated metadata: sugar content, apple cultivar label, original image index, and grid position.
To prevent data leakage and evaluate model generalization, we conducted two separate data splitting experiments. In the first experiment, we employed a cultivar-based strategy: out of eight apple cultivars, six (75%) were used for training, one (12.5%) for validation, and one (12.5%) for testing. This strict split ensured that the model was evaluated on completely unseen apple types, allowing us to assess its ability to generalize to novel cultivars. This strategy was consistently applied across all five grid configurations to examine how spatial resolution affects prediction accuracy.
In the second experiment, as illustrated in Table 2, we adopted a sugar concentration-based split. Instead of holding out entire cultivars, we assigned specific sugar concentrations of each apple type exclusively to either the test (≈15.9%) or validation set (≈13.6%), ensuring that those concentrations were absent from the training set (≈70.5%). This arrangement allowed the model to encounter every cultivar during training.
By comparing these splitting approaches, we aimed to understand how the breadth and nature of training data influence the model’s reliability on unseen cases.
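Both strategies can be expressed as boolean masks over the dataset rows, as in the sketch below. The held-out cultivars and the (cultivar, concentration) pairs used here are hypothetical: the actual assignment is the one given in Table 2, and the pairs below were chosen only so that the set sizes reproduce the approximate proportions reported above.

```python
import numpy as np

CULTIVARS = ["Granny Smith", "Aport", "Gala", "Starcrimson",
             "Idared", "Golden", "Simirenko", "Red Jonaprince"]
LEVELS = list(range(25, 80, 5))            # 25, 30, ..., 75 (11 levels)


def cultivar_split(cultivars, val_cultivar, test_cultivar):
    """Hold out one whole cultivar for validation and one for testing."""
    cultivars = np.asarray(cultivars)
    val = cultivars == val_cultivar
    test = cultivars == test_cultivar
    return ~(val | test), val, test


def concentration_split(cultivars, levels, val_pairs, test_pairs):
    """Hold out specific (cultivar, sugar level) pairs for validation and testing."""
    pairs = list(zip(cultivars, levels))
    val = np.array([p in val_pairs for p in pairs])
    test = np.array([p in test_pairs for p in pairs])
    if np.any(val & test):
        raise ValueError("a pair cannot be in both validation and test sets")
    return ~(val | test), val, test


# One row per jam sample (88 rows); every image of a sample follows its row.
rows = [(c, s) for c in CULTIVARS for s in LEVELS]
cultivars = [c for c, _ in rows]
levels = [s for _, s in rows]

# Hypothetical choice of held-out cultivars.
train_a, val_a, test_a = cultivar_split(cultivars, "Gala", "Idared")

# Hypothetical held-out pairs: 14 for testing, 12 for validation.
test_pairs = ({(c, 30 + 5 * k) for k, c in enumerate(CULTIVARS)}
              | {(c, 70) for c in CULTIVARS[:6]})
val_pairs = {(c, 25) for c in CULTIVARS} | {(c, 75) for c in CULTIVARS[:4]}
train_b, val_b, test_b = concentration_split(cultivars, levels, val_pairs, test_pairs)
```

Splitting at the level of jam samples rather than individual subregions keeps all grid cells of one image in the same set, which is what prevents leakage between training and evaluation data.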

2.4. Spectral Analysis of Apple Cultivars

To provide an overview of the acquired hyperspectral data, the mean reflectance spectra of all eight apple cultivars at eleven sugar concentration levels (25–75%) are presented in Figure 4. Despite the addition of sugar, the overall spectral characteristics remain the same for all varieties, showing a characteristic dip in the 400–500 nm range and reflectance peaks in the 700–900 nm range. Differences among cultivars can still be observed; for example, Idared generally displays higher reflectance across most wavelengths compared to Granny Smith and Gala. The visible differences highlight the potential of hyperspectral sensing to discriminate between apple cultivars and quantify sugar content in apple jams.

2.5. Machine Learning

Three regression algorithms were evaluated for predicting sugar content from hyperspectral data: support vector machine (SVM) [41], eXtreme gradient boosting (XGBoost) [42], and one-dimensional residual network (1D ResNet) [43]. Each model was trained and validated using preprocessed spectral data obtained from five grid configurations (1 × 1, 2 × 2, 3 × 3, 4 × 4, and 5 × 5).

2.5.1. SVM

SVM is a kernel-based supervised learning algorithm commonly used for modeling complex nonlinear relationships in medium-sized regression tasks. In this study, SVM was selected as a classical baseline due to its robustness and strong performance on high-dimensional data such as hyperspectral spectra. We employed the radial basis function (RBF) kernel, which is well-suited for capturing nonlinearity in spectral features [44]. Prior to training, all input vectors were scaled to the [0, 1] range using MinMax normalization to ensure numerical stability.
A grid search was conducted to identify optimal hyperparameters. The best-performing configuration was found to be C = 110, ε = 0.1, and γ = 0.01, based on validation performance. These values provided a good balance between model flexibility and generalization ability.
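The reported configuration can be sketched with scikit-learn, assuming that library as the implementation (the text does not name one). The synthetic spectra and targets below are placeholders for the real features, and the pipeline fits the MinMax scaler on the training data only.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Hyperparameters reported in the text; scikit-learn itself is an assumption.
model = make_pipeline(
    MinMaxScaler(),                                   # scale each band to [0, 1]
    SVR(kernel="rbf", C=110, epsilon=0.1, gamma=0.01),
)

# Synthetic stand-in for 204-band mean spectra and sugar content in percent.
rng = np.random.default_rng(0)
X_train = rng.random((200, 204))
y_train = 25 + 50 * X_train[:, :10].mean(axis=1)
X_test = rng.random((50, 204))

model.fit(X_train, y_train)
pred = model.predict(X_test)
```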

2.5.2. XGBoost

XGBoost is a gradient-boosted decision tree algorithm widely recognized for its efficiency, scalability, and strong predictive performance on structured tabular data. In this study, XGBoost served as a classical ensemble-based baseline to model nonlinear dependencies in spectral features. Prior to training, all input features were scaled to the [0, 1] range using MinMax normalization. The training data were converted into DMatrix objects for optimized handling within the XGBoost framework.
The model was configured with a learning rate of 0.10, a maximum tree depth of 5, and a column subsampling ratio set to 0.95 to reduce overfitting. Training was conducted for up to 400 boosting rounds, with early stopping applied if the validation root mean square error (RMSE) did not improve for 40 consecutive rounds. The objective function was set to squared error regression.

2.5.3. 1D ResNet

The 1D ResNet used in this study was specifically implemented to handle sequential spectral inputs. The architecture begins with a custom-padded 1D convolutional layer, followed by 8 residual blocks. Each block consists of two convolutional layers with batch normalization and ReLU activation, along with dropout layers. Downsampling is applied every second block, and the number of filters doubles at regular intervals, allowing the network to progressively increase feature dimensionality while reducing resolution along the spectral axis. This enables the model to capture both fine-grained and high-level spectral patterns. The network concludes with global average pooling and a fully connected regression head.
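The overall structure can be sketched in PyTorch as follows. Only the block count, the downsampling every second block, the periodic filter doubling, the pooling, and the regression head come from the description above. The kernel sizes, base filter count, dropout rate, doubling interval, and use of standard symmetric padding are assumptions, and the class names are hypothetical, so this is not the authors' exact network.

```python
import torch
from torch import nn


class ResidualBlock1D(nn.Module):
    """Two conv layers with batch norm, ReLU, and dropout plus a skip connection."""

    def __init__(self, in_ch, out_ch, stride=1, kernel_size=3, dropout=0.1):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, stride=stride, padding=pad, bias=False)
        self.bn1 = nn.BatchNorm1d(out_ch)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, padding=pad, bias=False)
        self.bn2 = nn.BatchNorm1d(out_ch)
        self.relu = nn.ReLU()
        self.drop = nn.Dropout(dropout)
        self.shortcut = None
        if stride != 1 or in_ch != out_ch:
            # 1x1 convolution so the skip path matches the main path's shape.
            self.shortcut = nn.Sequential(
                nn.Conv1d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm1d(out_ch),
            )

    def forward(self, x):
        identity = x if self.shortcut is None else self.shortcut(x)
        out = self.drop(self.relu(self.bn1(self.conv1(x))))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)


class SpectralResNet1D(nn.Module):
    def __init__(self, base_filters=16, n_blocks=8):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv1d(1, base_filters, 7, padding=3, bias=False),
            nn.BatchNorm1d(base_filters),
            nn.ReLU(),
        )
        blocks, ch = [], base_filters
        for i in range(n_blocks):
            stride = 2 if i % 2 == 1 else 1                    # downsample every second block
            out_ch = ch * 2 if (i > 0 and i % 4 == 0) else ch  # assumed doubling interval
            blocks.append(ResidualBlock1D(ch, out_ch, stride))
            ch = out_ch
        self.blocks = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool1d(1)                    # global average pooling
        self.head = nn.Linear(ch, 1)                           # regression head

    def forward(self, x):                                      # x: (batch, 1, bands)
        x = self.blocks(self.stem(x))
        return self.head(self.pool(x).squeeze(-1)).squeeze(-1)


model = SpectralResNet1D().eval()
with torch.no_grad():
    out = model(torch.rand(4, 1, 204))                         # four spectra with 204 bands
```

Each spectrum enters as a single-channel sequence of 204 values, and the network returns one scalar prediction per spectrum.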
The model was implemented in PyTorch and trained using the Adam optimizer with a fixed learning rate of 10⁻³ and weight decay of 10⁻⁴. The loss function used was the mean absolute error (MAE), and training was limited to 30 epochs, with early stopping based on validation performance. If the validation MAE did not improve for 5 consecutive epochs, training was halted.
Before training, all spectral inputs were normalized to the [0, 1] range using MinMax scaling, and sugar content values were likewise rescaled. Data were fed into the network using PyTorch’s DataLoader, with shuffling applied during training. Experiments were conducted using multiple batch sizes (32, 64, 128, and 256) to evaluate the model’s sensitivity to input resolution and training dynamics.
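The training procedure described in this subsection can be sketched as the loop below. The optimizer settings, loss, epoch limit, patience, and shuffling follow the text. Restoring the best weights after stopping is an assumption, and the small fully connected model and synthetic data are placeholders for the 1D ResNet and the real spectra.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_with_early_stopping(model, train_loader, val_loader, max_epochs=30, patience=5):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    loss_fn = nn.L1Loss()                       # mean absolute error
    best_mae, best_state, stale = float("inf"), None, 0
    history = []
    for _ in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        # Validation MAE over the whole validation set.
        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for x, y in val_loader:
                total += (model(x) - y).abs().sum().item()
                count += y.numel()
        val_mae = total / count
        history.append(val_mae)
        if val_mae < best_mae:
            best_mae, stale = val_mae, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            stale += 1
            if stale >= patience:               # no improvement for `patience` epochs
                break
    if best_state is not None:
        model.load_state_dict(best_state)       # assumption: keep the best weights
    return best_mae, history


# Synthetic stand-in: inputs and targets already scaled to [0, 1].
torch.manual_seed(0)
X = torch.rand(256, 204)
y = X[:, :20].mean(dim=1)
model = nn.Sequential(nn.Linear(204, 16), nn.ReLU(), nn.Linear(16, 1), nn.Flatten(0))
train_loader = DataLoader(TensorDataset(X[:192], y[:192]), batch_size=64, shuffle=True)
val_loader = DataLoader(TensorDataset(X[192:], y[192:]), batch_size=64)
best_mae, history = train_with_early_stopping(model, train_loader, val_loader)
```

Because the targets are rescaled to [0, 1] for training, predictions need to be mapped back to the original sugar scale before metrics are reported in percent.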

3. Results

The models were evaluated on five grid configurations: 1 × 1, 2 × 2, 3 × 3, 4 × 4, and 5 × 5. Performance was assessed using the coefficient of determination (R2), RMSE, and MAE. All models were trained on a Lenovo Legion 5 15ARH05H laptop equipped with an NVIDIA GeForce GTX 1660 Ti GPU. Deep learning experiments were conducted in PyTorch (v2.7.0+cu118) with CUDA (v11.8) acceleration, using 1D ResNet models. Each configuration was trained for up to 30 epochs with early stopping (patience = 5), although convergence was typically achieved within 10–15 epochs. Training time ranged from approximately 5 min (e.g., 1 × 1 grid, batch size 256) to about 1 h (e.g., 5 × 5 grid, batch size 32). The extended runtime observed for the 5 × 5 grid with a batch size of 32 reflects the combined effect of subdividing images into finer grids that created more data segments, and smaller batch sizes which required more iterations to complete each epoch. To ensure stability, each setup was repeated across 5 independent runs to account for training variability and to obtain stable estimates of model performance.
Classical models were trained on a CPU using the same laptop without GPU acceleration. SVM training completed in under one minute per configuration, while XGBoost required slightly longer due to its iterative tree-building process and early stopping. No parallelization or GPU-based acceleration was applied for these models, and experiments were run using a single CPU thread.
These differences in runtime highlight the trade-off between model complexity and computational demand, illustrating that deep neural networks achieve high representational power at the cost of increased training time, whereas classical models remain lightweight and efficient under CPU-only conditions.
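For reference, the three evaluation metrics used throughout this section can be computed as in the following NumPy sketch. The function name `regression_metrics` and the example values are illustrative and are not taken from the experiments.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Coefficient of determination (R2), RMSE, and MAE for a regression task."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
    return {
        "r2": float(1.0 - ss_res / ss_tot),
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        "mae": float(np.mean(np.abs(residual))),
    }


# Illustrative values only: three samples at 25%, 50%, and 75% sugar.
metrics = regression_metrics([25, 50, 75], [27, 49, 72])
```

RMSE and MAE are expressed in the same units as the target, whereas R2 is dimensionless, which is why the three are reported together.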

3.1. Experiment with RGB Images

To evaluate the limitations of RGB imaging for predicting sugar content, models were trained exclusively on RGB data using a concentration-based split across grid sizes. As summarized in Table 3, all models exhibited substantially lower accuracy, with the best CNN result reaching only R2 = 0.22, while XGBoost and SVM achieved similarly modest values (R2 = 0.21 and R2 = 0.18, respectively). Furthermore, increasing the grid size consistently degraded the performance of all models. A likely explanation is that finer spatial subdivision produced patches too small for the already limited spectral information to retain global patterns, which emphasized noise and local fluctuations and limited the models’ ability to capture meaningful spectral-spatial relationships.
These systematically low results indicate that RGB signals, limited to three broad channels, are insufficient for detecting biochemical variations, such as sugar concentrations, regardless of their spatial subdivision. The results highlight the necessity of hyperspectral imaging (HSI) for this application.

3.2. Cultivar-Based Data Splitting on HSI

The best regression performance was obtained by the ResNet model on the 4 × 4 grid with batch size 64, with an R2 of 0.948, RMSE of 3.622, and MAE of 2.764 (see Table 4). Other top-performing configurations included 5 × 5 with a batch size of 64 (R2 = 0.944) and 4 × 4 with a batch size of 32 (R2 = 0.943), illustrating that finer spatial subdivision (4 × 4 and 5 × 5) was generally most beneficial, as it allows the network to capture localized spectral-spatial patterns. Batch size also mattered: smaller batch sizes, such as 32 and 64, generally performed better, whereas batch size 256 tended to perform worse. This effect was particularly apparent for the 1 × 1 grid, where a batch size of 256 resulted in a significantly lower mean R2 of 0.292, indicating underfitting when trained with low data variation and reflecting the inability of deep models to extract robust spectral-spatial representations under such constrained conditions.
The SVM model showed strong and stable performance across all grid sizes, reaching its best results at the 4 × 4 configuration with R2 = 0.938, RMSE = 3.940, and MAE = 3.107. This stability highlights SVM’s robustness to data partitioning. Similarly, as shown in Table 4, XGBoost performed best at the 4 × 4 grid (R2 = 0.9344, RMSE = 4.048, MAE = 3.173), with competitive scores across all configurations.
Overall, the 4 × 4 spatial subdivision consistently yielded the best outcomes across all methods. ResNet achieved the strongest individual result, while SVM and XGBoost provided more uniform performance across grid sizes. These findings suggest that utilizing spatial subdivision leads to improved regression accuracy and that neural models, such as ResNet, could have greater potential with proper tuning.

3.3. Sugar Concentration-Based Data Splitting on HSI

Overall, the concentration-based splitting approach resulted in improved regression performance across all models compared to the cultivar-based split, confirming that exposure to partial information from all cultivars allows models to generalize more effectively across sugar concentration levels. According to Table 5, the highest performance was achieved by the ResNet model on the 5 × 5 grid with batch size 64, reaching an R2 of 0.962, RMSE of 2.754, and MAE of 2.129. Other strong configurations included 4 × 4 and 3 × 3 with R2 values of 0.955 and 0.953, respectively. These results reaffirm the benefits of spatial subdivision and suggest that when partial information about all cultivars is available, the model can generalize more easily across sugar levels.
The SVM model also showed strong performance, peaking at R2 = 0.958 on the 5 × 5 grid (see Table 5), which suggests that kernel-based methods can reliably exploit spectral features regardless of grid configuration. XGBoost, by contrast, exhibited noticeably lower performance than in the cultivar-based experiment. Its best R2 was 0.889 on the 2 × 2 grid, and unlike the other models, it did not gain noticeably from increasing spatial resolution. This lack of benefit suggests that gradient-boosted trees are less adept at capturing subtle spectral-spatial patterns compared to ResNet and SVM.
Among all models, ResNet showed the most significant gain from this split strategy, indicating its ability to capitalize on subtle differences in the data. Both ResNet and SVM benefited from higher spatial resolution, with 5 × 5 and 4 × 4 grids yielding the strongest results. Figure 5 presents a scatter plot of predicted versus actual sugar concentrations on the test set. The close clustering of predicted points around the red ground truth points demonstrates that the 1D ResNet model provides highly accurate predictions across concentration levels.

4. Discussion

This study investigated the potential of combining hyperspectral imaging with machine learning as a non-invasive method for predicting sugar content in processed apple jam. Traditional techniques, although accurate, are often destructive and time-consuming, making them less suitable for high-throughput or real-time applications [45]. By using spectral data captured from a portable VNIR camera and applying both classical and deep learning models, we aimed to evaluate a fast and scalable alternative for quality assessment in processed foods.
The choice of hyperspectral imaging was not arbitrary, but necessary. The large gap in predictive power comes from the difference in both the number and range of spectral bands. RGB images capture only three broad channels in the visible spectrum (around 400–700 nm), which provide limited information about the sample. Hyperspectral imaging, by contrast, records 204 narrow, contiguous bands that extend into the near-infrared region up to 1000 nm. This wider and more detailed coverage enables the system to detect subtle variations in reflectance that are directly influenced by the chemical composition of the jam, including differences related to sugar content. The results in Table 3 confirm that hyperspectral imaging is indispensable for reliable sugar content estimation in processed products.
Previous studies have applied HSI to predict sugar content in raw agricultural products, including tomatoes (R2 = 0.90, MSE ≈ 0.018 °Brix) [30], kiwifruit (R2 = 0.95, RPD = 3.26) [32], and peaches (R2 = 0.92, RMSE ≈ 0.67 °Brix) [34]. Importantly, these earlier studies evaluated model performance primarily under random data splits, which do not fully test a model’s ability to generalize to unseen cultivars or varying sugar levels. In contrast, our work introduced two more rigorous partitioning strategies: a cultivar-based split, in which entire apple varieties were excluded from training, and a sugar concentration-based split, in which ranges of concentration values were held out. Under these stricter scenarios, our best-performing ResNet model achieved R2 = 0.948 (cultivar split) and R2 = 0.962 (concentration split), with MAE values of about 2.8 and 2.1 °Brix, respectively. These results are competitive with, and in some cases exceed, those reported for raw fruits, demonstrating that hyperspectral learning is also effective for processed products, where cooking, homogenization, and the addition of ingredients lead to different optical and chemical properties compared to fresh produce. The incorporation of cultivar- and concentration-based splits provides a more rigorous benchmark for evaluating generalization compared to random partitions, making this study one of the first to report results on processed jams and to apply such splitting strategies in HSI-based food quality assessment.
Under the cultivar-based splitting strategy, where entire apple types were held out from training, the one-dimensional ResNet model consistently outperformed classical methods, achieving a peak R2 of 0.948. Its strong performance is attributed to its ability to capture complex spectral features through deep residual connections. Notably, this splitting approach revealed meaningful and consistent trends across all models. Specifically, the 4 × 4 spatial grid yielded the best results overall, while the 5 × 5 grid performed slightly worse, suggesting that excessive spatial subdivision may dilute the signal with noise. Smaller grids, such as 1 × 1 and 2 × 2, often underperformed, likely due to insufficient spatial diversity. These results suggest an optimal balance near the 4 × 4 configuration, where each patch retains sufficient relevant information while minimizing noise interference. For ResNet, batch size also influenced its performance: moderate values often resulted in more stable and accurate learning outcomes, while the largest size (256) sometimes led to unstable training or reduced accuracy, particularly with smaller grids. These observations make cultivar-based splitting a valuable benchmark, as it presents a more challenging yet interpretable setting for evaluating model behavior in relation to architectural and input design choices.
In contrast, the concentration-based splitting strategy produced higher overall performance across models. ResNet reached a peak R2 of 0.962, while SVM achieved up to 0.958. However, the trends were less consistent. SVM and XGBoost showed nearly uniform performance across all grid sizes, lacking the clear benefit from increased spatial resolution observed under the cultivar split. In particular, XGBoost underperformed relative to its results in the cultivar-based setting, suggesting that it may benefit more from encountering entirely unseen samples rather than partial exposure across types. Only 1D ResNet maintained a meaningful response to spatial subdivision, continuing to improve with finer grids and showing its capacity to learn from localized spectral structures even under less strict splitting.
When comparing the two strategies, it is evident that while the concentration-based approach led to better accuracy, it is a less strict test of the model’s ability to generalize. Since the model sees every apple type during training, it can rely on shared visual and spectral features to estimate sugar content, making the task easier. On the other hand, cultivar-based splitting presents a more realistic and robust scenario where the model must predict sugar content for apple types it has never seen before. Despite slightly lower performance, this setup revealed more meaningful trends, such as consistent improvements with finer grid sizes and more stable performance at intermediate batch sizes. The 4 × 4 grid consistently yielded the best results across models, while the 5 × 5 grid performed slightly worse, suggesting that excessive subdivision may introduce noise or reduce the useful signal.
This contrast is further illustrated in Figure 6, which shows the MAE per class for the best-performing ResNet model under both splitting strategies. Under the concentration-based split (Figure 6b), MAE is generally lower and more consistent across sugar levels, indicating that the model benefits from shared visual and spectral patterns seen during training. In contrast, the cultivar-based split (Figure 6a) results in higher average MAE and greater variability, revealing the model’s sensitivity to truly unseen data and underscoring the difficulty of this generalization setting.
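A per-class MAE of the kind summarized in Figure 6 can be computed by grouping absolute errors by the true sugar level. The sketch below is a minimal illustration; the helper name `mae_per_class` is hypothetical and the numbers are made up for the example, not results from the study.

```python
from collections import defaultdict


def mae_per_class(y_true, y_pred):
    """Mean absolute error grouped by the true sugar level."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return {level: sum(e) / len(e) for level, e in sorted(errors.items())}


# Made-up predictions for three sugar levels (percent); not data from the study.
y_true = [25, 25, 50, 50, 75, 75]
y_pred = [27.0, 24.0, 50.5, 48.5, 70.0, 76.0]
per_class = mae_per_class(y_true, y_pred)
# per_class -> {25: 1.5, 50: 1.0, 75: 3.0}
```

Reporting the error per level, rather than a single average, shows whether a model is uniformly accurate or degrades at particular concentrations.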
While the results demonstrate strong potential for applying hyperspectral imaging and machine learning in processed food analysis, the study was subject to several limitations that should be considered. One limitation observed was that, although most hyperspectral images were of high quality, some showed minor inconsistencies in jam placement or camera angle, which introduced variability in the size and alignment of the cropped regions used for analysis. Another limitation involved occasional reflectance artifacts caused by strong lighting, where overexposed areas were incorrectly represented as dark regions due to an imperfect white reference. These cases were limited in number and did not noticeably affect overall model performance. Nonetheless, future work could improve robustness by implementing a more standardized imaging setup and calibration process. Finally, the scope of the study was limited to a single product type, and further research is needed to assess whether the proposed approach generalizes well to other processed food items with varying textures, compositions, and optical characteristics.
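To make the calibration issue concrete, the sketch below applies the standard white/dark reference correction, reflectance = (raw − dark) / (white − dark), to a single spectrum and flags spectra whose values fall outside the physically expected range. The function name `calibrate`, the strict [0, 1] range, and the toy sensor counts are assumptions for illustration; this section does not describe the actual calibration code.

```python
def calibrate(raw, white, dark, eps=1e-9):
    """Convert one raw spectrum to reflectance using white and dark references.

    All three arguments are equal-length lists of sensor counts per band.
    Returns (reflectance, suspect), where `suspect` is True if any band has
    an unusable reference (white <= dark) or a reflectance outside [0, 1].
    """
    reflectance, suspect = [], False
    for r, w, d in zip(raw, white, dark):
        span = w - d
        if span <= eps:
            # Reference is unusable for this band; emit 0.0 and flag it.
            reflectance.append(0.0)
            suspect = True
            continue
        value = (r - d) / span
        if value < 0.0 or value > 1.0:
            # Brighter than the white reference or darker than the dark one.
            suspect = True
        reflectance.append(value)
    return reflectance, suspect


# Toy counts (hypothetical): a well-exposed pixel and an overexposed one.
ok_spectrum, ok_flag = calibrate([60, 110], [110, 210], [10, 10])
hot_spectrum, hot_flag = calibrate([130], [110], [10])
```

Flagging such pixels before cropping, instead of silently clipping them, is one possible way to keep reflectance artifacts out of the regions used for analysis; in practice a small tolerance above 1.0 may be preferable to a strict bound.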

5. Conclusions

In the course of this study, we generated a novel dataset of 1760 samples from eight apple cultivars, which were obtained from the most diverse source of apples in the world. The samples cover sugar concentrations ranging from 25% to 75% and were imaged at different angles.
This study demonstrated that hyperspectral imaging, combined with machine learning models, offers a promising non-destructive solution for predicting sugar content in processed apple jam. Among the evaluated methods, the one-dimensional ResNet model consistently achieved the highest predictive accuracy across both data splitting strategies, particularly when applied to moderately segmented spatial regions. While ResNet required longer training times and careful tuning, its strong performance highlights the advantages of deep learning in capturing complex spectral and spatial patterns.
Classical models, such as SVM and XGBoost, although generally less accurate, still produce reliable estimates with significantly lower computational costs and faster training times. These results demonstrate a practical trade-off: deep models offer superior performance for high-precision applications, while classical models may be preferred in settings where speed and simplicity are prioritized.
By focusing on a processed product type that has often been overlooked in previous hyperspectral studies, this work contributes to expanding the applicability of HSI beyond raw products. The proposed approach provides a foundation for developing scalable, real-time quality control tools in food processing environments. Future work may extend these findings beyond sugar content prediction to other product types, explore alternative model architectures, and evaluate robustness under industrial conditions. Moreover, hyperspectral imaging could be further applied to critical domains such as food authenticity and food adulteration detection, which are essential for safeguarding product integrity and consumer trust. In this way, HSI has the potential to serve not only as a tool for compositional quality assessment but also as a robust approach for addressing authenticity verification and adulteration monitoring across diverse food systems.

Author Contributions

Conceptualization, H.A.V. and M.-Y.C.; methodology, D.L., A.Z., R.O., T.R. and M.L.; software, D.L., A.Z., R.O. and T.R.; validation, D.L. and A.Z.; formal analysis, D.L. and T.R.; investigation, D.L., A.Z., R.O., M.L. and H.A.V.; resources, D.L., R.O., T.R. and M.-Y.C.; data curation, D.L. and R.O.; writing—original draft preparation, D.L., A.Z., T.R., M.-Y.C. and H.A.V.; writing—review and editing, D.L., A.Z., R.O., T.R., M.L., M.-Y.C. and H.A.V.; visualization, D.L., A.Z., T.R. and H.A.V.; supervision, M.L. and M.-Y.C.; project administration, M.L., M.-Y.C. and H.A.V.; funding acquisition, M.-Y.C. and H.A.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Nazarbayev University, under the Faculty Development Competitive Research Grant Program (Grant No. 201223FD2603).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data, models, and usage instructions are publicly available in our GitHub repository (https://github.com/IS2AI/HSI_Apple_Jam/tree/main, accessed on 10 August 2025) and on our Hugging Face dataset page (https://huggingface.co/datasets/issai/Apples_HSI, accessed on 10 August 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
HSI: Hyperspectral imaging
VNIR: Visible to near-infrared
TSS: Total soluble solids
SSC: Soluble solids content
PLSR: Partial least squares regression
ROI: Region of interest
CSV: Comma separated values format
SVM: Support vector machine
XGBoost: eXtreme gradient boosting
1D ResNet: One-dimensional residual network
MAE: Mean absolute error

References

  1. Boyer, J.; Liu, R.H. Apple phytochemicals and their health benefits. Nutr. J. 2004, 3, 5.
  2. Richards, C.M.; Volk, G.M.; Reilley, A.A.; Henk, A.D.; Lockwood, D.R.; Reeves, P.A.; Forsline, P.L. Genetic diversity and population structure in Malus sieversii, a wild progenitor species of domesticated apple. Tree Genet. Genomes 2009, 5, 339–347.
  3. Auyeskhan, U.; Azhbagambetov, A.; Sadykov, T.; Dairabayeva, D.; Talamona, D.; Chan, M.Y. Reducing meat consumption in Central Asia through 3D printing of plant-based protein-enhanced alternatives—A mini review. Front. Nutr. 2024, 10, 1308836.
  4. CXS 296-2009; Standard for Jams, Jellies and Marmalades. Codex Alimentarius, International Food Standards, FAO/WHO: Rome, Italy, 2009.
  5. Verified Market Reports. Apple Jam Market Size and Forecast. 2024. Available online: https://www.verifiedmarketreports.com/product/apple-jam-market-size-and-forecast/#:~:text=Apple%20Jam%20Market%20size%20was%20valued%20at%20USD,increasing%20preference%20for%20natural%20and%20flavorful%20fruit%20spreads (accessed on 23 July 2025).
  6. Razbekova, M.; Issanov, A.; Chan, M.Y.; Chan, R.; Yerezhepov, D.; Kozhamkulov, U.; Akilzhanova, A.; Chan, C.K. Genetic factors associated with obesity risks in a Kazakhstani population. BMJ Nutr. Prev. Health 2021, 4, 90–101.
  7. Khan, S.; Litaf, U.; Shah, S.; Bilal, M.; Khan, A.; Ali, M.; Rani, S.; Shah, F.; Naz, R. Comparative studies on the shelf stability of different types of apple jams. Pak. J. Food Sci. 2015, 25, 37–42.
  8. Rippe, J.M.; Angelopoulos, T.J. Relationship between added sugars consumption and chronic disease risk factors: Current understanding. Nutrients 2016, 8, 697.
  9. Sewwandi, S.D.; Arampath, P.; Silva, A.B.G.; Jayatissa, R. Determination and comparative study of sugars and synthetic colorants in commercial branded fruit juice products. J. Food Qual. 2020, 2020, 7406506.
  10. Taleat, A.; Alabaakanfe, F.; Adeniyi, B. Evaluation of sugar types in selected brands of commercial fruit juice in Osun State, Nigeria. Int. J. Innov. Sci. Res. Technol. 2020, 5, 984–987.
  11. Ma, C.; Sun, Z.; Chen, C.; Zhang, L.; Zhu, S. Simultaneous separation and determination of fructose, sorbitol, glucose and sucrose in fruits by HPLC–ELSD. Food Chem. 2014, 145, 784–788.
  12. Damayanti, S.; Permana, B.; Weng, C. Determination of sugar content in fruit juices using high performance liquid chromatography. Acta Pharm. Indones. 2012, 37, 131–139.
  13. Al-Mhanna, N.M.; Huebner, H.; Buchholz, R. Analysis of the sugar content in food products by using gas chromatography mass spectrometry and enzymatic methods. Foods 2018, 7, 185.
  14. Luzzana, M.; Agnellini, D.; Cremonesi, P.; Caramenti, G.C. Enzymatic reactions for the determination of sugars in food samples using the differential pH technique. Analyst 2002, 126, 2149–2152.
  15. Jie, D.; Xie, L.; Rao, X.; Ying, Y. Using visible and near infrared diffuse transmittance technique to predict soluble solids content of watermelon in an on-line detection system. Postharvest Biol. Technol. 2014, 90, 1–6.
  16. Misto, M.; Mulyono, T.; Cahyono, B. Using Multisample Refractometer to Determine the Sugar Content of Sugarcane Juice in Sugar Factory Besuki. AIP Conf. Proc. 2020, 2278, 020026.
  17. Jarén, C.; Ortuño, J.C.; Arazuri, S.; Arana, I. Sugar Determination in Grapes Using NIR Technology. Int. J. Infrared Millim. Waves 2001, 22, 1521–1530.
  18. Borba, K.; Sapelli, K.S.; Spricigo, P.; Ferreira, M. Near Infrared Spectroscopy Sugar Quantification in Intact Orange. Citrus Res. Technol. 2017, 38, 1–7.
  19. Farooq, Z.; Ismail, A. Successful sugar identification with ATR-FTIR. Agro Food Ind. Hi-Tech 2014, 25, 36–39.
  20. Gao, L.; Smith, R.T. Optical hyperspectral imaging in microscopy and spectroscopy—A review of data acquisition. J. Biophotonics 2015, 8, 441–456.
  21. Zhao, Y.; Zeng, Y.; Zheng, Z.; Dong, W.; Zhao, D.; Wu, B.; Zhao, Q. Forest species diversity mapping using airborne LiDAR and hyperspectral data in a subtropical forest in China. Remote Sens. Environ. 2018, 213, 104–114.
  22. Zhang, Y.; Kong, X.; Deng, L.; Liu, Y. Monitor water quality through retrieving water quality parameters from hyperspectral images using graph convolution network with superposition of multi-point effect: A case study in Maozhou River. J. Environ. Manag. 2023, 342, 118283.
  23. Fei, B. Hyperspectral imaging in medical applications. In Data Handling in Science and Technology; Amigo, J.M., Ed.; Elsevier: Amsterdam, The Netherlands, 2019; Volume 32, pp. 523–565.
  24. Temiz, H.T.; Ulaş, B. A Review of Recent Studies Employing Hyperspectral Imaging for the Determination of Food Adulteration. Photochem 2021, 1, 125–146.
  25. Jia, W.; van Ruth, S.; Scollan, N.; Koidis, A. Hyperspectral imaging (HSI) for meat quality evaluation across the supply chain: Current and future trends. Curr. Res. Food Sci. 2022, 5, 1017–1027.
  26. Hitchman, S.; Loeffen, M.P.F.; Reis, M.M.; Craigie, C.R. Robustness of hyperspectral imaging and PLSR model predictions of intramuscular fat in lamb M. longissimus lumborum across several flocks and years. Meat Sci. 2021, 179, 108492.
  27. Hu, Y.; Huang, P.; Wang, Y.; Sun, J.; Wu, Y.; Kang, Z. Determination of Tibetan tea quality by hyperspectral imaging technology and multivariate analysis. J. Food Compos. Anal. 2023, 117, 105136.
  28. Santana, D.C.; Ratke, R.F.; Zanatta, F.L.; Campos, C.N.S.; Seron, A.C.d.S.C.; Teodoro, L.P.R.; Silva, N.P.d.; Oliveira, G.S.; Santos, R.G.d.; Alvarez, R.d.C.F.; et al. Caffeine content prediction in coffee beans using hyperspectral reflectance and machine learning. AgriEngineering 2024, 6, 4480–4492.
  29. Rady, A.; Guyer, D.; Lu, R. Evaluation of Sugar Content of Potatoes using Hyperspectral Imaging. Food Bioprocess Technol. 2015, 8, 995–1010.
  30. Yun, X.; Chen, Q.; Su, Z.; Zhang, L.; Chen, Z.; Zhou, G.; Yao, Z.; Xuan, Q.; Cheng, Y. Deep learning and hyperspectral images based tomato soluble solids content and firmness estimation. Front. Plant Sci. 2022, 13, 860656.
  31. Kang, C.; Diverres, G.; Karkee, M.; Zhang, Q.; Keller, M. Assessing grapevine water status through fusion of hyperspectral imaging and 3D point clouds. Comput. Electron. Agric. 2024, 226, 109488.
  32. Zhu, H.; Chu, B.; Fan, Y.; Tao, X.; Xin, W.; He, Y. Hyperspectral Imaging for Predicting the Internal Quality of Kiwifruits Based on Variable Selection Algorithms and Chemometric Models. Sci. Rep. 2017, 7, 7845.
  33. Su, Z.; Zhang, C.; Yan, T.; Zhu, J.; Zeng, Y.; Lu, X.; Gao, P.; Feng, L.; He, L.; Fan, L. Application of hyperspectral imaging for maturity and soluble solids content determination of strawberry with deep learning approaches. Front. Plant Sci. 2021, 12, 736334.
  34. Yang, B.; Gao, Y.; Yan, Q.; Qi, L.; Zhu, Y.; Wang, B. Estimation method of soluble solid content in peach based on deep features of hyperspectral imagery. Sensors 2020, 20, 5021.
  35. Ma, T.; Li, X.; Inagaki, T.; Yang, H.; Tsuchikawa, S. Noncontact evaluation of soluble solids content in apples by near-infrared hyperspectral imaging. J. Food Eng. 2017, 213, 89–97.
  36. Tian, Y.; Sun, J.; Zhou, X.; Yao, K.; Tang, N. Detection of soluble solid content in apples based on hyperspectral technology combined with deep learning algorithm. J. Food Process. Preserv. 2022, 46, e16414.
  37. Çetin, N.; Karaman, K.; Kavuncuoğlu, E.; Yıldırım, B.; Jahanbakhshi, A. Using hyperspectral imaging technology and machine learning algorithms for assessing internal quality parameters of apple fruits. Chemom. Intell. Lab. Syst. 2022, 230, 104650.
  38. Darnay, L.; Kralik, F.; Oros, G.; Koncz, A.; Firtha, F. Monitoring the effect of transglutaminase in semi-hard cheese during ripening by hyperspectral imaging. J. Food Eng. 2017, 196, 123–129.
  39. Lim, J.; Kim, G.; Mo, C.; Kim, M.S.; Chao, K.; Qin, J.; Fu, X.; Baek, I.; Cho, B.K. Detection of melamine in milk powders using near infrared hyperspectral imaging combined with regression coefficient of partial least square regression model. Talanta 2016, 151, 183–191.
  40. Specim, Spectral Imaging Ltd. Specim IQ: Portable Hyperspectral Imaging Camera. 2024. Available online: https://www.specim.com/iq/tech-specs/ (accessed on 23 July 2025).
  41. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
  42. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  44. Wang, X.; Li, Y.; Xu, C.; Wang, Z.; Du, Q. Spectral-similarity-based kernels of SVM for hyperspectral image classification. Remote Sens. 2020, 12, 2154.
  45. Nikzadfar, M.; Rashvand, M.; Zhang, H.; Shenfield, A.; Genovese, F.; Altieri, G.; Matera, A.; Tornese, I.; Laveglia, S.; Paterna, G.; et al. Hyperspectral Imaging Aiding Artificial Intelligence: A reliable approach for food qualification and safety. Appl. Sci. 2024, 14, 9821.
Figure 1. Eight apple cultivars used in this study.
Figure 2. Sample preparation (Stage A) and imaging setup (Stage B) used to acquire RGB and hyperspectral data (204 bands).
Figure 3. Preprocessing stages of apple jam images.
Figure 4. Reflectance spectra (unitless, after white/dark reference calibration) of jam samples prepared from eight apple cultivars (Aport, Gala, Golden, Granny Smith, Idared, Prince, Semirenko, and Starcrimson) across eleven sugar concentration levels (25–75%). Data were acquired using Specim IQ in the 400–1000 nm range with 204 bands.
Figure 5. Scatter plot of predicted versus actual sugar concentrations for the 5 × 5 grids test set using the 1D ResNet model. Each blue point represents one prediction, and the red points indicate the ground truth sugar concentration levels.
Figure 6. MAE per class for the best-performing ResNet model under two data-splitting strategies: (a) cultivar-based splitting and (b) sugar concentration-based splitting.
Table 1. Comparative analysis of existing solutions.

| Literature | Product | Type | Deep Learning | Classical ML | Target Trait(s) |
|---|---|---|---|---|---|
| [29] | Potato | Raw | | | SSC |
| [30] | Tomato | Raw | | | SSC, Firmness |
| [31] | Grapevine | Raw | | | Water status |
| [32] | Kiwi | Raw | | | SSC, Firmness |
| [33] | Strawberry | Raw | | | SSC, Maturity |
| [34] | Peach | Raw | | | SSC |
| [35] | Apple | Raw | | | SSC |
| [36] | Apple | Raw | | | SSC |
| [37] | Apple | Raw | | | SSC, Firmness |
| [38] | Cheese | Processed | | | Fat content |
| [39] | Milk Powder | Processed | | | Melamine |
| This work | Apple Jam | Processed | | | SSC |

Table 2. Distribution of sugar concentration values across apple jam types.     = Test,     = Validation.

| Apple Jam Type | Sugar Concentrations in % |
|---|---|
| aport | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| gala | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| golden | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| granny | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| prince | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| idared | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| simirenko | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |
| starcrimson | 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 |

Table 3. Performance of models trained on RGB images under concentration-based splitting across grid sizes.
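
| Model | Grid Size | R2 | RMSE | MAE |
|---|---|---|---|---|
| 1D ResNet | 1 × 1 | 0.22 | 13.96 | 11.48 |
| 1D ResNet | 2 × 2 | 0.14 | 14.59 | 11.84 |
| 1D ResNet | 3 × 3 | 0.11 | 14.87 | 12.18 |
| 1D ResNet | 4 × 4 | 0.08 | 15.10 | 12.24 |
| 1D ResNet | 5 × 5 | 0.06 | 15.28 | 12.28 |
| SVM | 1 × 1 | 0.18 | 14.27 | 11.81 |
| SVM | 2 × 2 | 0.14 | 14.61 | 12.11 |
| SVM | 3 × 3 | 0.13 | 14.71 | 12.24 |
| SVM | 4 × 4 | 0.12 | 14.81 | 12.31 |
| SVM | 5 × 5 | 0.11 | 14.83 | 12.32 |
| XGBoost | 1 × 1 | 0.21 | 13.98 | 11.78 |
| XGBoost | 2 × 2 | 0.18 | 14.29 | 11.87 |
| XGBoost | 3 × 3 | 0.17 | 14.39 | 11.98 |
| XGBoost | 4 × 4 | 0.14 | 14.58 | 12.13 |
| XGBoost | 5 × 5 | 0.14 | 14.61 | 12.18 |
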
Table 4. Performance of 1D ResNet, SVM, and XGBoost across grid size configurations under cultivar-based splitting.
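
| Model | Grid Size | R2 | RMSE | MAE |
|---|---|---|---|---|
| 1D ResNet | 1 × 1 | 0.92 | 4.47 | 3.60 |
| 1D ResNet | 2 × 2 | 0.92 | 4.54 | 3.66 |
| 1D ResNet | 3 × 3 | 0.90 | 4.18 | 3.31 |
| 1D ResNet | 4 × 4 | 0.95 | 3.62 | 2.76 |
| 1D ResNet | 5 × 5 | 0.94 | 3.73 | 3.01 |
| SVM | 1 × 1 | 0.90 | 4.95 | 4.04 |
| SVM | 2 × 2 | 0.90 | 5.13 | 4.17 |
| SVM | 3 × 3 | 0.90 | 4.95 | 3.83 |
| SVM | 4 × 4 | 0.94 | 3.94 | 3.11 |
| SVM | 5 × 5 | 0.92 | 4.42 | 3.59 |
| XGBoost | 1 × 1 | 0.84 | 6.27 | 4.71 |
| XGBoost | 2 × 2 | 0.91 | 4.63 | 3.65 |
| XGBoost | 3 × 3 | 0.92 | 4.42 | 3.50 |
| XGBoost | 4 × 4 | 0.93 | 4.05 | 3.17 |
| XGBoost | 5 × 5 | 0.92 | 4.37 | 3.44 |
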
Table 5. Performance of 1D ResNet, SVM, and XGBoost across grid size configurations under concentration-based splitting.
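
| Model | Grid Size | R2 | RMSE | MAE |
|---|---|---|---|---|
| 1D ResNet | 1 × 1 | 0.91 | 4.26 | 3.36 |
| 1D ResNet | 2 × 2 | 0.92 | 4.02 | 3.08 |
| 1D ResNet | 3 × 3 | 0.95 | 3.20 | 2.40 |
| 1D ResNet | 4 × 4 | 0.95 | 3.13 | 2.42 |
| 1D ResNet | 5 × 5 | 0.96 | 2.75 | 2.13 |
| SVM | 1 × 1 | 0.96 | 2.98 | 2.40 |
| SVM | 2 × 2 | 0.96 | 2.99 | 2.52 |
| SVM | 3 × 3 | 0.96 | 2.98 | 2.47 |
| SVM | 4 × 4 | 0.96 | 2.92 | 2.37 |
| SVM | 5 × 5 | 0.96 | 2.92 | 2.39 |
| XGBoost | 1 × 1 | 0.89 | 4.74 | 3.67 |
| XGBoost | 2 × 2 | 0.89 | 4.72 | 3.65 |
| XGBoost | 3 × 3 | 0.89 | 4.73 | 3.67 |
| XGBoost | 4 × 4 | 0.89 | 4.76 | 3.70 |
| XGBoost | 5 × 5 | 0.89 | 4.76 | 3.69 |
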
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lissovoy, D.; Zakeryanova, A.; Orazbayev, R.; Rakhimzhanova, T.; Lewis, M.; Varol, H.A.; Chan, M.-Y. Hyperspectral Imaging for Quality Assessment of Processed Foods: A Case Study on Sugar Content in Apple Jam. Foods 2025, 14, 3585. https://doi.org/10.3390/foods14213585
