1. Introduction
Watermelon is native to South Africa and is a fruit consumed daily for its nutritional value and its sweet, crisp flesh. Watermelon pulp is rich in nutrients such as protein, carbohydrates, dietary fiber, potassium, phosphorus, calcium, iron, and sodium. The seeds, on the other hand, contain aromatic oils and can be used in fried foods [1]. The moisture content of watermelon seeds is a critical parameter for evaluating seed quality, as it directly influences both seed yield and overall quality [2]. When the moisture content is too high, seed respiration increases and microorganisms multiply, causing mold to develop and impairing germination. When the moisture content is too low, cells die from dehydration and the seeds lose their viability [3]. Only a certain range of moisture content is favorable for seed storage and germination; excessive or insufficient moisture impairs the metabolic activity and viability of the seed [4]. Hence, accurate moisture content measurement is crucial for ensuring optimal storage conditions and maximizing the germination potential of watermelon seeds.
Currently, researchers use various chemical and physical methods to determine the moisture content of watermelon seeds, including Karl Fischer titration and weight-drying. Karl Fischer titration determines the moisture content of a sample from the amount of iodine consumed when iodine and sulfur dioxide react with the water in the sample [5]. The weight-drying method involves placing the seeds in a temperature-controlled drying oven until their weight stops changing, and then calculating the moisture content from the weight lost [6]. Although these methods measure the moisture content of watermelon seeds effectively, they destroy seed viability and are laborious and time-consuming when large batches must be tested. In recent years, scholars have applied a variety of non-destructive testing techniques to seed quality assessment, including near-infrared spectroscopy, thermography, and X-ray inspection [7]. Near-infrared (NIR) spectroscopy surpasses other optical sensing techniques in both cost-effectiveness and detection speed, making it a widely adopted method for assessing seed quality, for example in detecting moisture and protein in soybean [8], the vigor of maize seeds [9], and the moisture content of cherry seeds [10]. However, conventional NIR spectroscopy acquires spectral data from only a single point on the sample surface and cannot provide comprehensive information on seed quality.
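The weight-drying calculation described above can be illustrated with a minimal sketch. It assumes moisture content is expressed on a wet basis (mass lost divided by initial mass), which is a common convention but is not stated in the text; the function name and the sample masses are hypothetical.

```python
def moisture_content_wet_basis(initial_mass_g: float, dry_mass_g: float) -> float:
    """Return moisture content as a percentage of the initial (wet) mass.

    Assumption: wet-basis definition, MC = (m_initial - m_dry) / m_initial * 100.
    """
    if initial_mass_g <= 0:
        raise ValueError("initial mass must be positive")
    if not 0 <= dry_mass_g <= initial_mass_g:
        raise ValueError("dry mass must lie between 0 and the initial mass")
    return (initial_mass_g - dry_mass_g) / initial_mass_g * 100.0


# Hypothetical sample: 10.00 g of seeds that weigh 9.20 g at constant weight.
print(round(moisture_content_wet_basis(10.00, 9.20), 2))
```

A dry-basis definition (mass lost divided by dry mass) would give a slightly higher figure for the same weights, so the convention should be fixed before comparing reference values across studies.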
To overcome the limitations of the above detection techniques, this study proposes detecting the moisture content of watermelon seeds with hyperspectral imaging, a non-destructive testing technique. Hyperspectral imaging uses a hyperspectral imager to capture spatial images of the target object across many spectral bands. Depending on the imaging mode employed, it can be divided into hyperspectral reflectance imaging and transmittance imaging. Zhang et al. [11] used near-infrared spectroscopy and hyperspectral reflectance imaging to predict the moisture content of maize seeds. Comparative analyses showed that the partial least squares regression (PLSR) model using hyperspectral reflectance data was superior to the PLSR model based on near-infrared spectroscopy. Sun et al. [12] used hyperspectral reflection imaging (HRI) to assess the moisture content of barley seeds. Support vector regression (SVR) and PLSR models were constructed based on full-band spectral and feature wavelength spectral data, respectively. The results showed that the SVR model constructed using the feature wavelengths selected by the successive projection algorithm (SPA) yielded the most accurate results, with a coefficient of determination of prediction ($R_p^2$) of 0.883 and a root mean square error of prediction (RMSEP) of 0.0198%. Lu et al. [13] used hyperspectral reflection imaging combined with an SVR algorithm to predict the moisture content of rice seeds. The results showed that the SVR model optimized using the Simulated Annealing Genetic Algorithm (SAGA) performed best, with an $R_p^2$ of 0.8892 and an RMSEP of 0.0296 after clustering the feature wavelengths selected by the SPA. Although hyperspectral reflection technology shows great potential in assessing seed moisture content, reflection spectral data provide only surface information on the samples, and the accuracy of moisture content prediction still needs to be improved. To address the limitations of reflection spectral data alone, this study proposes combining hyperspectral reflection and transmission imaging with data fusion methods to improve the accuracy and robustness of watermelon seed moisture content prediction.
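The two metrics quoted for these models can be computed as in the sketch below. It assumes the coefficient of determination of prediction is defined as one minus the ratio of residual to total sum of squares on the prediction set; some studies report the squared correlation coefficient instead, and the cited papers' exact definitions are not given here. The reference and predicted values are hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_prediction(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted moisture contents (as fractions).
reference = [0.10, 0.12, 0.14, 0.16]
predicted = [0.11, 0.12, 0.13, 0.16]
print(r2_prediction(reference, predicted), rmsep(reference, predicted))
```

A higher coefficient of determination together with a lower RMSEP indicates a better prediction model, which is how the models in this study are ranked.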
In summary, this study supplements the reflection spectra with transmission spectra and fuses the two to identify the most effective data type for accurately predicting the moisture content of watermelon seeds. The specific research objectives are as follows: (1) to develop PLSR and least squares support vector regression (LSSVR) models that quantify the moisture content of watermelon seeds from variously preprocessed reflectance and transmission spectral data, respectively; (2) to combine the preprocessed full-band reflectance and transmission spectral data and rebuild the PLSR and LSSVR models on the combined data; (3) to extract critical wavelengths from the reflectance and transmission spectra with the competitive adaptive reweighted sampling (CARS) and uninformative variable elimination (UVE) algorithms, and to improve the moisture content prediction model using an intermediate fusion strategy.
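The two fusion strategies named in objectives (2) and (3) can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: fusion is taken to mean sample-wise concatenation of the two spectral blocks, the function names are hypothetical, and the selected band indices are placeholders standing in for the output of a wavelength selection step such as CARS or UVE.

```python
def full_band_fusion(reflectance, transmittance):
    """Concatenate full-band reflectance and transmittance spectra per sample."""
    if len(reflectance) != len(transmittance):
        raise ValueError("both blocks must contain the same samples")
    return [list(r) + list(t) for r, t in zip(reflectance, transmittance)]


def intermediate_fusion(reflectance, transmittance, refl_bands, trans_bands):
    """Keep only the selected bands of each block, then concatenate per sample."""
    if len(reflectance) != len(transmittance):
        raise ValueError("both blocks must contain the same samples")
    return [
        [r[i] for i in refl_bands] + [t[j] for j in trans_bands]
        for r, t in zip(reflectance, transmittance)
    ]


# Hypothetical data: 2 seeds, 3 reflectance bands, 2 transmittance bands.
R = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
T = [[1.0, 2.0], [3.0, 4.0]]
print(full_band_fusion(R, T))                  # 5 features per seed
print(intermediate_fusion(R, T, [0, 2], [1]))  # 3 features per seed
```

The fused matrix would then be passed to a regression model such as PLSR or LSSVR. Intermediate fusion yields a smaller feature set because uninformative bands are discarded before the blocks are joined.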
4. Conclusions
In this study, hyperspectral imaging was employed to assess the moisture content of watermelon seeds. To enhance the accuracy of moisture content prediction, a data fusion technique incorporating both reflection and transmission spectral data was used. The key results can be summarized as follows:
- (1) Among the models built on a single spectral type (reflection or transmission), those using transmission spectral data outperformed those using reflection data. The LSSVR model built on the original transmission spectrum showed the best results, with an $R_p^2$ and RMSEP of 0.8654 and 0.0182, respectively.
- (2) Primary data fusion of the reflection and transmission spectra improved model performance. The best predictive model for primary data fusion was Baseline–LSSVR, with an $R_p^2$ and RMSEP of 0.8875 and 0.0166, respectively. Models based on primary data fusion predicted better than models based on single spectral data, with an improvement of 2.55%.
- (3) Selecting feature wavelengths with the CARS and UVE algorithms improved the performance of the prediction model. The best prediction model was the T-Raw-CARS-LSSVR model, with an $R_p^2$ and RMSEP of 0.8914 and 0.0163, respectively. The prediction was improved by 3.00% compared with the model built on the full-spectrum data.
- (4) The CARS algorithm proved to be a more effective wavelength selection method than the UVE algorithm. Intermediate data fusion, using feature wavelengths selected by CARS from the reflection and transmission spectra, optimized model predictions. The RAW-CARS-LSSVR model achieved the most accurate prediction, with an $R_p^2$ and RMSEP of 0.9149 and 0.0144, respectively. The LSSVR model constructed using intermediate fused spectral data showed the largest improvement in predicting the moisture content of watermelon seeds: its prediction accuracy increased by 5.72% compared with the model developed from single full-spectrum data.
This research demonstrates that combining hyperspectral imaging with a data fusion approach can accurately assess the moisture content of watermelon seeds. Compared with a single spectrum, the intermediate fused data contain more sample information about watermelon seeds, which significantly improves the stability and accuracy of the resulting model. This finding lays a theoretical groundwork for the application of hyperspectral imaging in the evaluation of seed quality.
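The percentage gains quoted in results (2) to (4) can be checked with a short calculation. The sketch assumes each gain is the relative increase in the coefficient of determination of prediction over the best single-spectrum model (0.8654); the text does not state the formula, but the reported figures are consistent with this reading.

```python
def relative_gain_percent(new_value: float, baseline: float) -> float:
    """Relative increase of new_value over baseline, in percent."""
    return (new_value - baseline) / baseline * 100.0


# Baseline: best single-spectrum model (LSSVR on the original transmission spectrum).
baseline_r2 = 0.8654

# Coefficients of determination of prediction reported in the conclusions.
reported_r2 = {
    "primary fusion (Baseline-LSSVR)": 0.8875,
    "feature wavelengths (T-Raw-CARS-LSSVR)": 0.8914,
    "intermediate fusion (CARS, LSSVR)": 0.9149,
}

for model_name, r2 in reported_r2.items():
    print(f"{model_name}: {relative_gain_percent(r2, baseline_r2):.2f}%")
```

Under this assumption the three gains come out at 2.55%, 3.00%, and 5.72%, matching the values stated above.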