Hyperspectral imaging (HSI) combines spectroscopy and optical imaging and provides information about the chemical properties of a material and its spatial distribution [1]. HSI is a form of non-invasive imaging that applies visible and near-infrared radiation (wavelengths 400 nm to 2500 nm) to chemical or biological substances to measure differential reflection [2]. Due to the vast amount of information obtainable from hyperspectral images compared to images in the RGB (red, green, blue) color model, HSI has been widely applied in research and industry. Applications include rapid, environmentally friendly, and non-invasive analysis in remote sensing [3], biodiversity monitoring [5], health care [6], wood characterization [7], and the food industry [8].
However, despite these potential benefits, the wide application of HSI is restrained by the considerable cost of high-quality imaging devices compared to conventional RGB sensors. Moreover, most HSI devices are scanning-based (using either push-broom or filter-scanning approaches), making them less portable and time-consuming to operate, which seriously limits the broader application of HSI technology [10]. In addition, snapshot hyperspectral cameras, which are able to take images quickly, often feature a rather low spatial resolution [11]. High-resolution hyperspectral information is appealing as it provides not only the spectral signatures of chemical compounds but also spatial detail [12].
Deep learning approaches are increasingly applied in many areas of research and industry [13], and have recently enabled the development of hyperspectral image reconstruction approaches [16]. The reconstruction of hyperspectral information from RGB images is envisioned as a promising way of overcoming the current limitations of both scanner- and snapshot-camera-based hyperspectral imaging devices: it can provide images with both high spatial and high spectral resolution while being affordable, user friendly, and highly portable [17]. In particular, smartphone camera sensors can easily capture images at high spatial resolution, e.g., twelve million pixels per image, providing a sound basis for reconstructing high-resolution hyperspectral images. While reconstruction approaches were initially rigid and complex [18], limiting their usability in practice, recent progress, in particular the application of deep learning, has enabled easier, faster, and more accurate hyperspectral image reconstruction pipelines [16]. Several contrasting approaches based on deep learning have been proposed recently [20].
While hyperspectral recovery from a single RGB image has improved greatly with the development of deep learning, it is still limited for several reasons. For example, the hyperspectral images used during method development have so far been restricted to the visual spectral range (VIS, 400–700 nm) with 31 wavebands and a spectral resolution of 10 nm [18]. Compared to the near-infrared range (NIR, 800 to 2500 nm), images in the visual range miss information important for many applications [23]. In addition, considerable uncertainty exists regarding the criteria for model performance evaluation. Currently, three evaluation metrics are widely used in performance assessment: the Mean Relative Absolute Error (MRAE), the Root Mean Square Error (RMSE), and the Spectral Angle Mapper (SAM) [17]. However, there is no general agreement on which criterion most robustly indicates a better model.
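For clarity, the three metrics can be sketched as follows (a minimal NumPy sketch; per-pixel spectra are compared waveband by waveband, and exact averaging conventions vary slightly between published implementations):

```python
import numpy as np

def mrae(pred, gt, eps=1e-8):
    """Mean Relative Absolute Error: absolute error normalized by ground truth."""
    return float(np.mean(np.abs(pred - gt) / (gt + eps)))

def rmse(pred, gt):
    """Root Mean Square Error over all pixels and wavebands."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

def sam(pred, gt, eps=1e-8):
    """Spectral Angle Mapper: mean angle (radians) between per-pixel spectra.

    pred, gt: arrays of shape (n_pixels, n_bands)."""
    dot = np.sum(pred * gt, axis=1)
    norms = np.linalg.norm(pred, axis=1) * np.linalg.norm(gt, axis=1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))
```

Note that SAM is insensitive to a uniform scaling of the spectrum (it measures direction, not magnitude), which is one reason the three metrics can disagree on which model is best.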
A key application of HSI is food quality evaluation [26]. Tomato is one of the most important fruits for daily consumption, and the fast, non-destructive evaluation of its quality is of great interest in both research and industry, rendering it a suitable object for a case study [27]. The taste of different tomato varieties and qualities is mainly affected by sugar content, acidity, and the ratio between them [29]. Previous studies used diverse instruments such as Raman spectrometers [30], near-infrared spectrophotometers [31], and a multichannel hyperspectral imaging instrument [33] to quantify these parameters. The normalized anthocyanin index (NAI) has been shown to be very effective in predicting lycopene content [34]. Lycopene, a secondary plant compound of the carotenoid class, may reduce the risk of developing several cancer types and coronary heart disease [36]. Making use of readily available RGB cameras, e.g., smartphone cameras, in combination with hyperspectral image reconstruction techniques would greatly facilitate the assessment of tomato quality parameters. In particular, it would promote the selection and sorting of tomato fruits in industry [37] and might even support consumers in choosing tomatoes of a preferred quality.
In this study, we demonstrate the use of a permutation test to select an appropriate state-of-the-art deep learning model for hyperspectral image reconstruction from a single RGB image. Subsequently, we show that the reconstructed images can be used to predict tomato quality properties with high accuracy through random forest (RF) regression, yielding an efficient pipeline from automatic segmentation to quality assessment. Finally, the application potential of reconstructed hyperspectral images is discussed.
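The kind of permutation test used here for model selection can be sketched as follows (a hypothetical NumPy illustration assuming paired per-image reconstruction errors from two candidate models; the exact test setup in this study may differ):

```python
import numpy as np

def paired_permutation_test(err_a, err_b, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on the mean of paired differences.

    err_a, err_b: per-image errors of models A and B on the same images.
    Returns a p-value for H0: no systematic difference between the models."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(err_a) - np.asarray(err_b)
    observed = abs(diff.mean())
    count = 0
    for _ in range(n_perm):
        # Under H0 the model labels are exchangeable, so each paired
        # difference may have its sign flipped at random.
        signs = rng.choice([-1.0, 1.0], size=diff.size)
        if abs((signs * diff).mean()) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

If the p-value falls below the chosen significance level, the model with the lower mean error is preferred; otherwise the observed difference is consistent with chance.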
HSI reconstruction has become popular and has opened a new field of low-cost methods for acquiring hyperspectral information at high spatial and spectral resolution. Even though some research has developed methods for reconstructing hyperspectral images [17], real-world applications of these methods are still lacking [10]. This study has demonstrated the potential of using the HSCNN-R model for hyperspectral reconstruction in the visual to near-infrared range to predict key quality parameters of tomato. Three different models could be selected depending on which of the three evaluation metrics (MRAE as evaluation metric, MRAE_EM; RMSE; and SAM) is used, as their respective minimum values did not occur in the same model, in line with previous findings [17]. A single lower value of any one of these three evaluation metrics thus cannot indicate a better model performance. As only the minimum errors of models trained with the MRAE loss function (MRAE_LF) and evaluated with MRAE_EM were found to be significantly lower than those of models trained with the MSE loss function, it can be concluded that these models were consistently superior in spectral reflectance reconstruction compared with models using other combinations of loss function and evaluation metric. This was also found by Shi et al. [17]: the MRAE_LF loss function is more robust to outliers and treats wavebands of the whole spectrum with different illumination levels more evenly than the MSE loss function. As these loss functions can also be purpose specific, MRAE_LF should be chosen if all wavebands are equally prioritized for better exploration of the whole spectrum, while MSE should be preferred if the highly illuminated spectral reflectance is of greater interest.
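The difference between the two loss functions can be made concrete in a short sketch (NumPy, illustrative only): MRAE divides each error by the ground-truth reflectance, so dim and bright wavebands contribute on a comparable relative scale, while MSE is dominated by absolute errors in highly illuminated bands.

```python
import numpy as np

def mrae_loss(pred, gt, eps=1e-8):
    # Relative error: each waveband is weighted by its own ground-truth level.
    return float(np.mean(np.abs(pred - gt) / (gt + eps)))

def mse_loss(pred, gt):
    # Squared error: bright bands with large absolute errors dominate.
    return float(np.mean((pred - gt) ** 2))

# A spectrum with one dim and one bright band, both 10% off:
gt   = np.array([0.05, 0.80])
pred = np.array([0.055, 0.88])
# MRAE sees both bands as equally wrong (10% relative error each),
# while the MSE contribution of the bright band is orders of magnitude larger.
```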
Models trained with the MRAE_LF loss function converged in fewer epochs at higher speed and reached lower errors than models trained with the MSE loss function. This is beneficial for hyperspectral image reconstruction, as model training is expected in practice to feed real-time prediction, e.g., sorting tomatoes on a conveyor belt based on lycopene content. The increase of the validation error after reaching its minimum value, for either MRAE_EM, RMSE, or SAM, was mainly due to overfitting on a small training dataset [53].
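Such overfitting is commonly handled by early stopping on the validation error; a minimal sketch (a hypothetical helper, not the training code used in this study):

```python
def early_stop_epoch(val_errors, patience=5):
    """Return (best_epoch, best_error) under a simple early-stopping rule:
    stop once the per-epoch validation error has not improved for
    `patience` consecutive epochs, keeping the weights of the best epoch."""
    best_epoch, best_err, waited = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch, best_err
```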
To further confirm the robustness of the selected model in reconstructing hyperspectral images, the reconstructed spectral reflectance of RGB images, either rendered from hyperspectral images or directly captured by a smartphone camera, was compared with the spectral reflectance measured directly by the hyperspectral camera ("ground truth"). The similarity of both the reflectance and its first derivative demonstrated that the selected approach reliably reconstructed the spectral pattern. As the RGB images used for training were rendered from hyperspectral images using a standard CIE matching function, while smartphone RGB sensors have different spectral sensitivity functions that likely deviate from CIE [22], an increase of errors during the reconstruction of hyperspectral images from smartphone RGB images might be expected. However, although the smartphone RGB images were completely new to the trained model, the reconstruction results demonstrated the soundness of the model in recovering spectral reflectance even from regular RGB images taken by a standard smartphone.
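The rendering step can be sketched as a weighted projection of the hyperspectral cube onto three channels (an illustrative NumPy sketch; the Gaussian curves below are made-up stand-ins for the actual tabulated CIE matching functions or camera sensitivities):

```python
import numpy as np

def render_rgb(cube, wavelengths):
    """Project a hyperspectral cube (H, W, bands) to an RGB image (H, W, 3).

    The Gaussian sensitivity curves below merely stand in for real
    CIE/camera spectral sensitivity functions."""
    centers = np.array([600.0, 550.0, 450.0])   # nominal R, G, B peaks (nm)
    width = 40.0
    # Sensitivity matrix of shape (3, bands), one row per channel.
    sens = np.exp(-((wavelengths[None, :] - centers[:, None]) ** 2)
                  / (2 * width ** 2))
    sens /= sens.sum(axis=1, keepdims=True)     # each channel sums to 1
    return np.tensordot(cube, sens.T, axes=([2], [0]))

wl = np.linspace(400, 1000, 61)                 # 61 bands, 10 nm steps
cube = np.random.default_rng(0).random((4, 4, wl.size))
rgb = render_rgb(cube, wl)                      # shape (4, 4, 3)
```

Because a real smartphone sensor implements a different (and usually unknown) sensitivity matrix, RGB images captured by a phone are not exactly the images the model was trained on, which is the source of the extra error discussed above.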
The very high R² value in NAI prediction showed that the reconstructed hyperspectral images are suitable for predicting tomato lycopene content non-destructively from RGB images of intact tomatoes. NAI is an indicator of lycopene, which is closely reflected by the color of the tomato [54]. Tomato color changes from green to red due to the degradation of chlorophyll while lycopene accumulates during development [55]. The high prediction accuracy for NAI via reconstructed hyperspectral images is probably also related to the large range of color change in the corresponding tomato samples. Both total titratable acidity (TTA) and soluble solids content (SSC) were predicted less precisely; however, their ratio (STR) was predicted with a high R² value and very high significance in the F-test, which agrees with earlier findings [35]. The higher precision of reconstructed hyperspectral images in predicting STR values is fortunate, as STR is also more informative than either TTA or SSC alone: tomato flavor is determined mainly by the ratio of sugar to acidity rather than by the two separate properties [29]. Overall, the high accuracy in tomato quality prediction highlights the robustness and potential of hyperspectral image reconstruction.
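As an illustration of how such an index is derived from a reconstructed cube, a normalized-difference index of the NAI type can be computed per pixel from two wavebands (sketch only; the band positions below are illustrative placeholders, not the definition from [34]):

```python
import numpy as np

def normalized_index(cube, wavelengths, band_a, band_b, eps=1e-8):
    """(R_a - R_b) / (R_a + R_b) per pixel, using the nearest available bands.

    cube: reconstructed reflectance, shape (H, W, bands)."""
    ia = int(np.argmin(np.abs(wavelengths - band_a)))
    ib = int(np.argmin(np.abs(wavelengths - band_b)))
    ra, rb = cube[..., ia], cube[..., ib]
    return (ra - rb) / (ra + rb + eps)

wl = np.linspace(400, 1000, 61)
cube = np.random.default_rng(0).random((3, 3, wl.size))
# Illustrative band pair only; consult [34] for the actual NAI definition.
idx = normalized_index(cube, wl, band_a=780.0, band_b=570.0)
```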
The good to very good performance, both in reconstructing hyperspectral images from RGB smartphone images unseen by the model and in predicting tomato quality parameters at moderate to high accuracy, makes the HSCNN-R model an important tool for future imaging applications. Hyperspectral reconstruction from a single RGB image makes HSI applications mobile and low cost, and allows easy implementation through either a cloud service or an app. With the selected model and trained weights, we can now generate hyperspectral images of tomatoes of the same variety with consumer-level cameras and explore other hyperspectral properties of interest, as the example of tomato quality predicted from smartphone images illustrates in this study (Figure 5). Specifically, this provides substantial benefits for tomato research and industry, and potentially also for other fruit crops such as cucumber and apple. Even though it is possible to predict the STR of each pixel of a tomato image directly without reconstructing the whole spectrum from 400 to 1000 nm, the fully reconstructed spectral reflectance offers much higher flexibility, as it provides opportunities to explore other hyperspectral properties through different machine learning algorithms. Important bands can even be selected to reduce workload while improving prediction accuracy, as is common in HSI analysis [56].
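Band selection of the kind mentioned here can be sketched as ranking wavebands by the absolute correlation of their reflectance with the target quality parameter (an illustrative filter-style sketch; wrapper methods or model-based importances are equally common in HSI analysis):

```python
import numpy as np

def top_bands_by_correlation(spectra, target, k=10):
    """Rank wavebands by |Pearson correlation| with the target.

    spectra: (n_samples, n_bands) mean reflectance per fruit.
    target:  (n_samples,) quality parameter, e.g., STR.
    Returns the indices of the k most correlated bands."""
    x = spectra - spectra.mean(axis=0)
    y = target - target.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum()) + 1e-12
    corr = (x * y[:, None]).sum(axis=0) / denom
    return np.argsort(-np.abs(corr))[:k]
```

A downstream regressor can then be trained on the selected bands only, reducing acquisition and computation workload.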
This study is, to our knowledge, the first to demonstrate the use of hyperspectral image reconstruction from a single RGB image in a real-world application, using tomato fruits as an example. The capability of HSCNN-R for spectral reconstruction beyond the visual range towards the near-infrared was demonstrated. The hyperspectral images reconstructed from RGB images of tomatoes were able to estimate important tomato quality parameters with high accuracy.
Hyperspectral image reconstruction could be a promising approach for a range of other fields, thereby further developing its full potential. With HSCNN-R, we can potentially reconstruct hyperspectral images at both higher spatial and spectral resolution and at much lower cost. However, a reconstruction model built on tomatoes can probably not be transferred easily to other domains, e.g., to determining the chemical properties of other fruits, soils, or rocks, or to different lighting conditions, which can be an obstacle to extending the application range. Thus, libraries containing hyperspectral images of the different categories of interest (fruits of various varieties at different harvest stages and growing conditions (incl. stress), soil types, wood, skin, etc.) should be built, either to train models fitted to each category specifically or to develop a general model covering multiple categories.
Future advancements in this field should particularly focus on (1) the exploration of more robust models for hyperspectral image reconstruction under various illumination conditions, and (2) the extension of the application range using current state-of-the-art models and the building of libraries for a wider range of objects, as exemplified above. Higher-resolution hyperspectral images can thereby be expected to become more accessible for a range of real-world applications in the future.