1. Introduction
Adulteration in agriculture and food products is an essential safety and control area requiring rapid, accurate, robust, and automated methods for detecting, identifying, and quantifying adulterants, and coconut milk products are no exception. Coconut milk is generally extracted from grated coconut meat by pressing or squeezing, with or without the addition of water. Coconut milk is used as a major ingredient in several cuisines, in dishes such as curries and desserts [1]. There are two common reasons for adulteration in coconut milk products. The first is to increase production volume and reduce costs by adding tap water or old coconut water to coconut milk. The second is an attempt to boost the apparent carbohydrate content by adding corn flour.
Accurately identifying adulterants is important for controlling coconut milk product adulteration. The main constituents of coconut milk (moisture, total fat, carbohydrates, protein, and ash) change when it is mixed with other materials. As reported by Lakshanasomya et al. [2], laboratory testing can measure the total solids and total fat in coconut milk by drying it in a hot air oven or a vacuum oven, which takes more than 2 h to prepare one sample. Although accurate, this method is time-consuming and requires complicated sample pre-treatments and well-trained technicians, so it cannot be relied on for rapid monitoring. Near-infrared (NIR) and mid-infrared spectroscopy have gained considerable interest among the approaches based on physical properties, particularly for detecting adulteration in many agricultural and food products. Compared with the oven-based method, NIR and mid-infrared spectroscopy offer rapid, real-time response and simple testing, and they are non-destructive. However, these techniques require a calibration model to be developed before they can be used to make predictions.
Several calibration models have been developed using chemometric approaches to improve the prediction of coconut milk adulteration based on infrared spectroscopy. For instance, Azlin-Hashim et al. [3] employed partial least squares (PLS) regression to quantitatively determine the concentration of corn flour in coconut milk using an FT-IR spectrometer. That study combined mid-infrared spectroscopy with classical chemometrics. Although advantageous, classical chemometric analysis is frequently criticized for the expertise and subjectivity involved in elaborating spectral data, including selecting a preprocessing method based on what worked well on a previous data set and deciding how to highlight important spectral regions [4,5]. Therefore, Sitorus and Lapcharoensuk [6] adopted a machine-learning algorithm with automatic preprocessing to predict water in coconut milk using an FT-NIR spectrometer. Al-Awadhi and Deshmukh [7] utilized linear discriminant analysis (LDA) and K-nearest neighbors (KNN) as machine-learning classifiers to detect water in coconut milk using an FT-IR spectrometer. These approaches improved model accuracy, but the resulting models were observed to have complicated structures that were difficult to train and carried apparent risks of over-fitting. Moreover, although robust and accurate, these strategies have drawbacks related to data dimensionality and higher entropy, in addition to the effort they require and their limited practical feasibility [5,8,9]. Furthermore, learning representations of the data that identify and highlight the underlying explanatory factors hidden in it is still challenging in machine-learning applications [10]. Consequently, some studies are probing for a paradigm shift toward applying deep learning to resolve the issues related to classical and feed-forward neural network approaches.
Deep learning is a branch of machine learning that has been widely applied to images, where it learns to identify patterns within their spatial dimensions. It uses multiple processing layers to automatically learn complex representations from data without hand-coded rules or human domain knowledge. Among deep-learning algorithms, convolutional neural networks (CNNs) are presently among the most popular models since they do not require manual feature extraction and come in several network architecture types. CNNs are constructed with a series of convolutional layers that act as feature extractors, followed by fully connected layers at the end of the network that serve as predictors. CNNs have also recently been shown to be useful for one-dimensional (1D) spectroscopy data such as NIR spectra, as well as for regression tasks, wherein this supervised approach can perform both feature extraction and learning related to features of interest [8]. CNN techniques are developing rapidly, so many network architecture variants exist for various analysis purposes, such as AlexNet, ResNET, and GoogLeNet [11]. Furthermore, for chemometrics data, CNNs can be trained with fewer weights than fully connected or feed-forward neural networks, thereby lowering model complexity. Other advantages of CNNs are reduced neuron interdependence, adaptability to datasets beyond the training set, and a reduced risk of over-fitting, which is a common criticism of feed-forward networks. Moreover, some researchers [4,8] note that CNNs can simplify preprocessing and model development, thereby reducing the complexity of model development while improving the accuracy and robustness of model predictions. Recent papers on the use of CNNs for NIR spectroscopy in agricultural and food products have addressed adulteration in coffee products (Nallan Chakravartula et al. [4]), infant formula products (Liu et al. [12]), dairy products (Said et al. [13]), and minced beef products (Weng et al. [14]). This shows the increasing number of studies using CNN algorithms in NIR-based adulteration detection for agricultural and food products.
To the best of our knowledge, even though coconut milk adulteration has been investigated with another adulterant material and another type of spectroscopy [3,6], no study has explored deep learning as an advanced computational approach for quantifying adulterants using two types of NIR spectrometer (benchtop FT-NIR and portable Micro-NIR) and two types of solid adulterant (corn flour and tapioca starch). Therefore, the objective of this study was to bridge the gap between advanced perceptual sensors from NIR spectroscopy and data science by developing and testing the performance of four deep-learning CNN regressor architectures for detecting coconut milk adulteration with corn flour and tapioca starch.
4. Discussion
This study explored the feasibility of using deep learning to create a rapid, accurate, and robust model for predicting adulteration levels of coconut milk by corn flour and tapioca starch using FT-NIR and Micro-NIR spectroscopy. Non-destructive testing of adulteration in agriculture and food products by NIR spectrometers under laboratory conditions has been widely introduced. However, the procedures for developing the calibration models on which it depends are still limited, and deep learning can be a good solution to this problem. Compared with the results of previous studies using deep learning, for example, coffee adulteration prediction [4], adulteration in infant formula [12], adulteration of cow milk fat content by water [13], and minced beef adulteration [14], we obtained equally strong prediction results. Furthermore, the NIR spectrophotometer is simple to operate, which makes the method easier to promote.
This article presents a novel deep-learning regression method for quantitative NIR spectrum analysis. The method utilizes four network architectures: simple convolutional neural network (Simple CNN), S-AlexNET, ResNET, and GoogleNET. However, a robust NIR model for adulteration detection is known to be hard to achieve because of multiple sources of variation, such as different brands and batches of product, the simultaneous presence of several adulterants, temperature, humidity, and spectral drift of light sources, all of which make stable practical application difficult. Therefore, more advanced modeling investigations should be carried out in the future to evaluate and improve the robustness of the proposed method. The limitations of the proposed method should also be considered and addressed. For example, training a deep-learning method is much more time-consuming than training a traditional method or a regular machine-learning algorithm. For this reason, some NIR-based adulteration studies in food and agriculture products run deep-learning algorithms on graphics processing units (GPUs), such as the assurance of tea quality by Yang et al. [17] and the detection of adulteration of minced beef by Weng et al. [14]. Nevertheless, thanks to the fast development of deep-learning hardware, for instance, the graphics processing unit (GPU), associative processing unit (APU), tensor processing unit (TPU), and quantum processing unit (QPU), the testing time for the proposed network is acceptable.
As can be observed from Figure 2, the spectral profiles of coconut milk adulterated to different degrees with corn flour and tapioca starch were similar, with few substantial differences in peak positions and curve trends. In general, all the adulterated samples showed a few characteristic overlapping peaks contributed by the main constituents of coconut milk and the adulterant material, including fat, protein, moisture, ash, and carbohydrates. Samples with more adulterant material showed lower peak absorbance, for both corn flour and tapioca starch. This corresponds to the difference in moisture content between coconut milk and the adulterant: free moisture in the coconut milk is absorbed by the admixture agent until an equilibrium point is reached. As a result, coconut milk adulterated with more solid material has lower absorbance. This is in line with the report by Malvandi et al. [23], who stated that peak values and their corresponding wavelengths in the NIR region changed as the moisture content altered. Büning-Pfaue [24] emphasized that the strength and weakness of this absorption band come from the strong effect of hydrogen bonds on organic monomers, ions, and polymers in the sample. The constituents of the adulterated coconut milk samples were observed at the following main peaks, for both FT-NIR and Micro-NIR: 1210 nm (8262 cm⁻¹), related predominantly to the second overtone of CH bond stretching; 1453 nm (6881 cm⁻¹), to the first overtone of OH stretching bonds attributed to starch and water; 1728 nm (5786 cm⁻¹) and 1764 nm (5670 cm⁻¹), to the resonance bands of the first overtone of CH bond stretching; and 1929 nm (5184 cm⁻¹), to the second overtone of CH bond stretching [25,26,27].
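The paired wavelength and wavenumber values quoted for each peak follow the standard relation between the two units: wavenumber in cm⁻¹ equals 10⁷ divided by wavelength in nm. The following is a minimal sketch of that conversion; the function names are illustrative and not taken from the study. Values computed this way may differ from the reported ones by a few cm⁻¹, presumably because of rounding or the instruments' sampling grids.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nanometres."""
    return 1e7 / wavenumber_cm


# Main peak positions (nm) listed in the text for both instruments.
main_peaks_nm = [1210, 1453, 1728, 1764, 1929]

for peak in main_peaks_nm:
    print(f"{peak} nm corresponds to about {nm_to_wavenumber(peak):.0f} cm^-1")
```

The same two helpers can be used to place FT-NIR bands (reported in cm⁻¹) and Micro-NIR bands (reported in nm) on a common axis.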
The performance of the deep-learning prediction models in this study was evaluated against criteria used for predicting grain chemical composition content. Chu et al. [22] classified regression models by RPD as follows: less than 3, a poor or unreliable model; 3.1–4.9, a fair model; 5.0–6.4, a good model; 6.5–8.0, a very good model; and more than 8.1, an excellent model. For predicting the degree of adulteration of coconut milk with corn flour, the RPD results showed that Micro-NIR was superior to FT-NIR for all network architectures, while all the network architectures qualified as excellent models. For tapioca starch in coconut milk, Micro-NIR also outperformed FT-NIR in terms of RPD (Table 6). Among the network architectures, only ResNET on the FT-NIR data set had a low RPD and weak performance. Overall, the FT-NIR models for predicting adulteration of coconut milk with corn flour and tapioca starch give results comparable to those of Micro-NIR. The slightly reduced RPD of FT-NIR may be due to factors such as a lack of explanatory variables and collinearity, but, with the exception of that ResNET model, the RPD obtained is still higher than eight [28,29].
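The evaluation statistics discussed here can be illustrated with a short sketch. It assumes the common definitions, which this section does not spell out: Bias as the mean of predicted minus reference values, and RPD as the sample standard deviation of the reference values divided by the RMSE. The class boundaries follow the ranges quoted from Chu et al. [22]; how values falling between the quoted ranges (for example, between 3.0 and 3.1) are assigned is an assumption of this sketch, and the sample data are invented for illustration only.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, bias, and RPD for paired reference and predicted values."""
    n = len(y_true)
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    bias = sum(residuals) / n                  # assumed sign: predicted minus reference
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rpd = statistics.stdev(y_true) / rmse      # assumed definition: SD of reference / RMSE
    return {"r2": r2, "rmse": rmse, "bias": bias, "rpd": rpd}


def classify_rpd(rpd):
    """Map an RPD value to the model classes quoted in the text."""
    if rpd < 3.0:
        return "poor"
    if rpd < 5.0:
        return "fair"
    if rpd < 6.5:
        return "good"
    if rpd <= 8.0:
        return "very good"
    return "excellent"


# Hypothetical adulteration levels (%) and predictions, for illustration only.
reference = [0, 10, 20, 30, 40, 50]
predicted = [1, 9, 21, 29, 41, 49]
metrics = regression_metrics(reference, predicted)
print(metrics, classify_rpd(metrics["rpd"]))
```

Under these assumptions, the lowest testing RPD reported in the conclusions (2.958) falls in the poor class, while the lowest FT-NIR corn flour value (11.429) is classed as excellent.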
In chemometrics, a limited number of samples with high-dimensional features poses common problems such as overfitting and multicollinearity and obscures the features that dominate the data. Selecting the most important features can reveal the dominant variables in a high-dimensional dataset. In studies of NIR spectra, this can be done in many ways, one being through slope coefficients or regression coefficients. According to Palermo et al. [30], regression coefficients can be used to select appropriate predictors according to the magnitude of their absolute values. Likewise, according to Wold et al. [31], in classical chemometric analysis using partial least squares (PLS), small regression coefficients can be ignored as unimportant terms in order to find the most prominent features and correlate them with the chemical assignment of structures and bond vibrations in NIR spectroscopy. This approach has been used in previous NIR spectroscopy research, for example, in extra virgin olive oil adulteration prediction by PLS regressor [32], adulteration in quinoa flour by PLS regressor [33], aged-rice adulteration by competitive adaptive reweighted sampling (CARS) combined with PLS regressor [34], and adulterants of notoginseng powder by CARS-PLS regressor [35]. However, in applying advanced chemometrics using machine learning and deep learning, it is still a challenge to demonstrate coefficients that can represent important features.
The regression coefficients from the deep-learning algorithms used in this study can be represented using weight coefficients. Even though they are not strictly identical to the regression coefficients in classical chemometric analysis using PLS, the weight coefficients of each deep-learning network architecture can at least indicate which variables are more important than others for each response. Regression coefficients for deep-learning regressors were first introduced by Cui and Fearn [8] and tested on three NIR datasets, including wheat flour and protein content datasets. In their study, they randomly drew a few spectra from the dataset and plotted the corresponding regression coefficients. This is understandable because deep learning is a non-linear approach, so each spectrum has its own regression coefficient values, unlike the PLS regression coefficients, which are the same for all sample spectra. However, in this study, because the aim of showing the weight coefficients of each deep-learning network architecture is to find the dominant features in high-dimensional data, we use all training data spectra. We then average the weight coefficients over all the training data spectra and call the result the regression coefficients for each deep-learning network architecture. We apply a threshold score of 50% of the maximum and minimum peaks in the regression coefficients, as shown in Figure 5, Figure 7, Figure 9, and Figure 11. This approach is similar to the variable importance in projection (VIP) approach from PLS regression, which applies a threshold score rule that can be data specific, ranging between 0.83 and 1.21 [33,36].
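The averaging and thresholding steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-spectrum coefficients are taken as already computed (how they are extracted from each network is not shown), the 50% rule is read as keeping positive coefficients at or above half the maximum peak and negative coefficients at or below half the minimum peak, and all names and numbers are hypothetical.

```python
def mean_coefficients(per_spectrum_coefs):
    """Average coefficient vectors (one per training spectrum) variable by variable."""
    n_spectra = len(per_spectrum_coefs)
    return [sum(column) / n_spectra for column in zip(*per_spectrum_coefs)]


def important_variables(wavelengths, mean_coefs, fraction=0.5):
    """Keep variables whose mean coefficient passes the fractional peak threshold."""
    upper = fraction * max(mean_coefs)   # threshold for positive peaks
    lower = fraction * min(mean_coefs)   # threshold for negative peaks
    selected = []
    for wavelength, coef in zip(wavelengths, mean_coefs):
        if (coef > 0 and coef >= upper) or (coef < 0 and coef <= lower):
            selected.append(wavelength)
    return selected


# Toy example: four spectral variables and two training spectra (invented values).
wavelengths_nm = [1200, 1300, 1400, 1500]
per_spectrum = [
    [0.1, 0.8, -0.6, 0.0],
    [0.3, 1.0, -0.4, -0.2],
]
averaged = mean_coefficients(per_spectrum)
print(important_variables(wavelengths_nm, averaged))
```

Averaging first and thresholding second gives one coefficient curve per architecture, which is what allows the important bands of different architectures to be compared.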
In the case of corn flour in coconut milk (Figure 5 and Figure 9), we can see regression coefficients related to the presence of the structural groups CH and OH. In general, the regression coefficients in this study are in the range of 1200–1500 nm (8333–6667 cm⁻¹), which is related to the main wavelength of corn flour found by Jiang and Lu [37]. With FT-NIR, the regression coefficients overlap across all deep-learning network architectures in at least nine NIR bands. The first is at 7421 cm⁻¹ (1348 nm), which is related to the fourth overtone of CH₂ [25]. Next, the bands at 7306–7329 cm⁻¹ (1369–1364 nm) and 7344 cm⁻¹ (1362 nm) are a combination of CH stretching and CH deformation from CH₃ [26]. The bands at 7213–7228 cm⁻¹ (1386–1384 nm) and 7236–7244 cm⁻¹ (1382–1380 nm) correspond to OH stretching from H₂O [27]. Furthermore, the bands at 7167–7190 cm⁻¹ (1395–1391 nm) and 7182 cm⁻¹ (1392 nm) are related to a combination of CH stretching and CH deformation from CH₂ [26]. Lastly, the band at 4536–4544 cm⁻¹ (2205–2201 nm) is related to CH stretching and C=O stretching from CHO [26]. With Micro-NIR, by contrast, the regression coefficients overlap across all deep-learning network architectures in 12 NIR bands. These start at 1205–1206 nm (8299–8292 cm⁻¹), the fourth overtone of aromatic CH, followed by 1212 nm (8251 cm⁻¹) and 1224 nm (8170 cm⁻¹), the second overtone of CH₂ and CH [25,26]. In addition, the bands in the ranges 1249–1274 nm (8006–7849 cm⁻¹), 1342–1348 nm (7452–7418 cm⁻¹), and 1354–1404 nm (7386–7123 cm⁻¹) are the fourth overtone of beta-diketone, the fourth overtone of CH₂, and the third overtone of aldehydes, respectively [25]. Furthermore, the bands at 1391 nm (7189 cm⁻¹), 1404–1410 nm (7123–7092 cm⁻¹), and 1416–1422 nm (7062–7032 cm⁻¹) represent a combination of CH stretching with CH deformation, the first overtone of OH stretching, and a combination of CH stretching with CH deformation and the first overtone of OH stretching, respectively [26]. Finally, the bands at 1515 nm (6601 cm⁻¹), 1540 nm (6494 cm⁻¹), and 1614–1621 nm (6196–6169 cm⁻¹) are related to the first overtone of CH, the first overtone of OH (starch), and the first overtone of =CH₂, respectively [25,26].
When examining tapioca starch in coconut milk (Figure 7 and Figure 11), we see regression coefficients associated with the structural groups CH, CC, CNO, and OH. The structural groups detected in this sample differed slightly from those for coconut milk adulterated with corn flour, because the composition of the adulterant material is also different. The study by Williams [38] was confirmed by Phetpan and Sirisomboon [39], who stated that the peak in the 1400 nm (7143 cm⁻¹) region was associated with the glucose molecules in the tapioca starch constituents. With FT-NIR spectroscopy, the regression coefficients overlap across all deep-learning network architectures in five NIR spectral bands: 7213–7221 cm⁻¹ (1386–1385 nm), 7182–7190 cm⁻¹ (1392–1391 nm), 6380–6403 cm⁻¹ (1567–1562 nm), 5963–5979 cm⁻¹ (1677–1673 nm), and 4621–4636 cm⁻¹ (2164–2157 nm), which correspond to the third overtone of carbonyl stretching, CH₂ combination stretching and deformation, the second overtone of CC stretching, the first overtone of aromatic CH stretching, and the second overtone of CNO, respectively [25,26]. With Micro-NIR spectroscopy, the regression coefficients overlap across all deep-learning network architectures in 11 NIR spectral bands. The bands at 1255–1276 nm (7968–7837 cm⁻¹), 1360–1367 nm (7353–7315 cm⁻¹), 1379–1385 nm (7252–7220 cm⁻¹), 1404 nm (7123 cm⁻¹), and 1410 nm (7092 cm⁻¹) correspond to the third overtone of CC stretching, combination stretching and deformation from CH₃, the third overtone of carbonyl stretching, the third overtone of carbonates, and the first overtone of OH stretching, respectively [25,26]. Next, the bands at 1416 nm (7062 cm⁻¹), 1428 nm (7003 cm⁻¹), 1552–1553 nm (6443–6439 cm⁻¹), 1614–1621 nm (6196–6169 cm⁻¹), and 1627–1633 nm (6146–6124 cm⁻¹) are related to combination stretching and deformation of CH₂, the first overtone of NH stretching, the third overtone of carbonyl stretching, the first overtone of OH stretching, the first overtone of =CH₂ stretching, and the first overtone of CH stretching, respectively [25,26].
This study analyzed important wavelengths by deep learning and found that the important wavelengths are not all the same across deep-learning network architectures. In addition, even though the FT-NIR wavelength range covers the Micro-NIR range, the important wavelengths are not precisely the same for both instruments. Nevertheless, some important bands overlapped between FT-NIR and Micro-NIR. In the case of corn flour, the bands at 7421 cm⁻¹ (1348 nm) and 7167–7190 cm⁻¹ (1395–1391 nm) were found in FT-NIR, and those at 1342–1348 nm (7452–7418 cm⁻¹) and 1391 nm (7189 cm⁻¹) were found in Micro-NIR. In the case of tapioca starch, the band at 7213–7221 cm⁻¹ (1386–1385 nm) was found in FT-NIR and that at 1379–1385 nm (7252–7220 cm⁻¹) in Micro-NIR. The differences may be caused by the nature of each regressor, whose convolutional layers can transform the spectra to fit the subsequent regression scheme. In other words, the deep-learning regressor carries out automatic preprocessing, as reported by Cui and Fearn [8]. This causes the final shape of each spectrum between the “flatten” and “dense fully connected” stages to differ according to the output variable. This difference eventually results in different important wavelengths for each deep-learning network architecture. Even so, several wavelengths are shared by all deep-learning network architectures.
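The cross-instrument overlaps noted above can be checked numerically by converting each FT-NIR band from cm⁻¹ to nm and testing whether it intersects a Micro-NIR band. The sketch below uses only band limits quoted in this section; the helper names and the optional tolerance are assumptions made for illustration, not part of the study's procedure.

```python
def band_to_nm(wavenumber_a, wavenumber_b):
    """Convert a band given by two wavenumbers (cm^-1) to (low_nm, high_nm)."""
    low, high = sorted((1e7 / wavenumber_a, 1e7 / wavenumber_b))
    return low, high


def bands_overlap(band_a, band_b, tol_nm=0.0):
    """Return True when two (low_nm, high_nm) bands intersect within a tolerance."""
    return band_a[0] <= band_b[1] + tol_nm and band_b[0] <= band_a[1] + tol_nm


# (FT-NIR band in cm^-1, Micro-NIR band in nm) pairs quoted in the text.
pairs = {
    "corn flour, 7421 cm^-1": ((7421, 7421), (1342, 1348)),
    "corn flour, 7167-7190 cm^-1": ((7167, 7190), (1391, 1391)),
    "tapioca starch, 7213-7221 cm^-1": ((7213, 7221), (1379, 1385)),
}

for label, (ft_band_cm, micro_band_nm) in pairs.items():
    ft_band_nm = band_to_nm(*ft_band_cm)
    print(label, "overlaps" if bands_overlap(ft_band_nm, micro_band_nm) else "does not overlap")
```

A small non-zero `tol_nm` may be useful when the two instruments sample the spectrum at different resolutions.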
5. Conclusions
Deep learning as a novel approach to predicting the level of adulteration of coconut milk was successfully developed and tested based on spectra from benchtop FT-NIR and portable Micro-NIR. Models based on FT-NIR spectroscopy for predicting the adulteration level of corn flour in coconut milk (1–50%) can be generated using deep-learning regressor network architectures (Simple CNN, S-AlexNET, ResNET, GoogleNET), with R², RMSE, and Bias at the training stage ranging from 0.996 to 0.999, from 0.370 to 0.958%, and from −0.027 to 0.120, respectively. At the testing stage, R², RMSE, Bias, and RPD range from 0.992 to 0.998, from 0.686 to 1.256%, from −0.012 to 0.176, and from 11.429 to 20.866, respectively. Although still good, the performance of the FT-NIR prediction models is lower than that of Micro-NIR with the same deep-learning regressor network architectures. With Micro-NIR, R², RMSE, and Bias at the training stage range from 0.998 to 0.999, from 0.363 to 0.706%, and from −0.053 to −0.183, respectively. At the testing stage, R², RMSE, Bias, and RPD range from 0.998 to 0.999, from 0.463 to 0.597%, from −0.023 to 0.123, and from 23.981 to 31.094, respectively.
Similarly, for the models predicting tapioca starch adulteration in coconut milk, performance based on the Micro-NIR dataset is better than that based on FT-NIR with the same deep-learning regressor network architectures. With Micro-NIR, the performance ranges at training (R², RMSE, Bias) and testing (R², RMSE, Bias, RPD) are from 0.998 to 1.000, from 0.298 to 0.637%, from −0.029 to −0.111, from 0.998 to 0.999, from 0.370 to 0.611%, from −0.035 to −0.068, and from 23.521 to 39.349, respectively. With FT-NIR, the corresponding ranges are from 0.892 to 0.999, from 0.482 to 5.850%, from −0.035 to 1.017, from 0.886 to 0.998, from 0.670 to 6.108%, from −0.202 to 1.481, and from 2.958 to 21.421, respectively.
In closing, the prediction results demonstrated that the proposed deep-learning architectures yielded superior regression performance for both FT-NIR and Micro-NIR in predicting the level of adulterants (corn flour and tapioca starch) in coconut milk. While finding the optimal deep-learning architecture is complex and computationally expensive, implementation and training are straightforward once it is found. Furthermore, developing deep-learning architectures and applying them are two different study matters that should not be confused. This study also indicated that deep learning for NIR spectroscopy data is less dependent on preprocessing than classical chemometric methods and can still achieve excellent performance.