Evaluation of a One-Dimensional Convolution Neural Network for Chlorophyll Content Estimation Using a Compact Spectrometer

Abstract: Leaf chlorophyll content is a major indicator of plant stress and growth, and hyperspectral remote sensing is frequently used to monitor it. Hyperspectral reflectance measured with portable spectroradiometers has been used to evaluate vegetation properties such as pigment content, plant structure and physiological features. However, the prices of these devices have not yet decreased to consumer-affordable levels, which prevents widespread use. In this study, a system based on a cost-effective fingertip-sized spectrometer (Colorcompass-LF; the total price of the proposed solution was approximately 1600 USD) was evaluated for its ability to estimate the chlorophyll contents of radish and wasabi leaves and was compared with the Analytical Spectral Devices FieldSpec4. The chlorophyll contents per leaf area of radish were generally higher than those of wasabi, ranging from 42.20 to 94.39 µg/cm² for radish and from 11.39 to 40.40 µg/cm² for wasabi. The chlorophyll content was estimated using regression models based on a one-dimensional convolutional neural network (1D-CNN) that were generated after the original reflectance from the spectrometer measurements was de-noised. The results from an independent validation dataset confirmed the good performance of the Colorcompass-LF after spectral correction using a second-degree polynomial, and very similar estimation accuracies were obtained for the measurements from the FieldSpec4. The coefficients of determination of the 1D-CNN-based regression models were almost the same (R² = 0.94), and the ratios of performance to deviation based on reflectance after spectral correction using a second-degree polynomial were 4.31 and 4.33 for the Colorcompass-LF and the FieldSpec4, respectively.


Introduction
Chlorophyll is one of the primary pigments involved in photosynthesis, which takes place in the chloroplasts that contain it. Therefore, chlorophyll content is related to photosynthetic capacity [1]. Chlorophyll content has also been used to evaluate crop status, such as plant physiological activity, to ensure high yield [2,3] and to support other aspects of crop management. Portable chlorophyll content meters, such as the SPAD-502 Leaf Chlorophyll Meter (Konica Minolta Inc.), have been used to measure in-situ leaf chlorophyll content. However, the leaf dry weight and thickness often make the results of such meters ambiguous [4,5], and the use of these devices is restricted. Alternative techniques based on hyperspectral reflectance measured with portable spectroradiometers, such as the Ocean Optics hyperspectral visible and near-infrared (Vis-NIR) spectroradiometer [6,7] and the Analytical Spectral Devices (ASD) FieldSpec series [8-10], have been proposed. Reflectance in the blue (420-470 nm) and red (640-680 nm) wavelength ranges depends on leaf pigments, especially chlorophyll, and a peak in the green region (520-580 nm) indicates a high chlorophyll content [11]. Based on these features, vegetation indices, i.e., the normalised difference [12-16], modified normalised difference [17], simple difference [18,19], simple ratio [12,13,20-34] or an integration of such measures, have been widely used to characterise vegetation [35], and a number of vegetation indices have been developed to evaluate the chlorophyll content. In addition, the numerical inversion of radiative transfer models has been proposed to estimate the chlorophyll content from hyperspectral reflectance data acquired by FieldSpec spectrometers [36,37]. However, the prices of spectrometers have not yet decreased to consumer-affordable levels, and this is the chief obstacle to their practical use.
Consequently, the development of a low-cost hyperspectral remote sensing system would prove useful [38]. Recently, highly sensitive, cheap, and fingertip-sized spectrometers, such as the C12880MA-10 (Hamamatsu Photonics), have been released, and their potential for estimating chlorophyll content should be evaluated. In this study, reflectance measurements were obtained from two spectrometers, the Colorcompass-LF, which is based on the C12880MA-10, and the FieldSpec4. Chlorophyll estimates were obtained based on the reflectance measurements, and then the results from these spectrometers were compared.
Various factors, such as the signal-to-noise ratio of sensors, obscure reflectance data and therefore reduce the measurement accuracy [39]. Furthermore, vegetation indices developed from measurements by spectrometers with a narrow full width at half maximum (FWHM) are not always applicable to data from spectrometers with a broader FWHM, i.e., a coarser spectral resolution. The pre-processing of original reflectance data is effective for noise removal and for correcting the slope or base shift of the spectra, thereby producing accurate reflectance data for the evaluation of vegetation properties such as chlorophyll content. Pre-processing techniques have been widely applied as an essential step to remove noise in original reflectance data [40,41]. For example, de-trending (DT) is an effective pre-processing technique used to eliminate the effect of the additive interference of scattered light from particles [42]. Earlier studies have identified DT as the best pre-processing technique to estimate various properties of wasabi and tea leaves from reflectance data obtained using a FieldSpec4 spectroradiometer and leaf clippings [43-45]. Standard normal variate (SNV) transformation is effective for reducing the noise or baseline shift in raw reflectance data caused by light scattering [46]. SNV transformation was the most common pre-processing technique applied in earlier studies [47,48]. DT and SNV have also been compared with respect to their ability to estimate chlorophyll content.
In addition to the de-noising of original spectra, the choice of algorithm is an important step in improving the accuracy of estimates derived from reflectance data. In recent years, deep learning-based algorithms have been successful in expressing complex relationships, and their strong performance in the evaluation of vegetation properties has been reported [49]. Furthermore, deep learning has become increasingly prevalent following the rapid development of big data and computing power in the past few years [50,51]. The one-dimensional convolutional neural network (1D-CNN) is one of the most effective deep learning architectures and has been used to evaluate soil properties using Vis-NIR reflectance [49,52]. Deep belief nets (DBNs), which have a probabilistic generative architecture composed of multiple layers of stochastic latent variables [53], have also performed well in hyperspectral remote sensing [54,55]. In general, high-specification computers are required to generate regression models based on deep learning algorithms. Google Colaboratory is a free cloud-based Jupyter notebook environment that allows regression models to be generated on graphics processing units. This service was used to generate regression models based on 1D-CNN for our proposed method of low-cost field-scale monitoring.
The specific sub-objectives of this research are (1) to compare the chlorophyll estimation accuracies based on reflectance from the Colorcompass-LF and FieldSpec4, (2) to determine the best pre-processing technique for the original reflectance data and (3) to compare regression models based on 1D-CNN with those based on a DBN.

Measurements and Datasets
Two Brassicaceae species were examined in this study: radish (Raphanus sativus), which is normally cultivated in agricultural fields and has a high chlorophyll content, and wasabi (Eutrema japonicum), which is normally grown in hydroponic culture and has a relatively low chlorophyll content.
The radish plants were cultivated at a within-row distance of 60 cm and an inter-row spacing of 90 cm in a field at Shizuoka University, Japan. Basal fertiliser (6 kg of N, P and K) was applied per 10 a, in addition to 120 kg of silicate fertiliser (The Sangyo Shinko Co. Ltd., Tokyo, Japan) and 3.6 g of boric acid. The experiment comprised a control without slag (control) and a slag fertiliser treatment (slag) that received SK calcium silicate (NJ Eco Service, Kitakyushu, Japan), which has a soluble silicic acid content of 32%. Seeding was conducted on 23 October 2020, and two supplementary fertiliser applications, each consisting of 4.8 kg of N, P and K per 10 a, were performed on 7 and 25 November. A total of 144 leaves (72 leaves per treatment) were measured for reflectance and chlorophyll content determination on 2 and 3 March 2021.
One-year-old wasabi mericlone seedlings were cultivated individually in Wagner pots (1/5000 a) containing 3 L of tap water (adjusted to a pH of 6.0 using HCl and NaOH) and were continuously aerated from 28 January 2021. After one week, slightly modified solutions of 0.1 × Hoagland solution [56] were applied stepwise for one week at a strength of 1/100 and 1/10 to adapt the plants to the hydroponic system under standard nutrient solution conditions. Hoagland solution is one of the most widely used solutions for growing plants, containing 0. 25 O. Sulphur has been reported to be important in improving the allyl isothiocyanate concentration and yield, which determine plant pungency, and sulphur is frequently added to nitrogen fertiliser [57]. Thus, sulphur was added at four concentrations: the standard concentration (control, 0.58 mM SO 4 2− ), zero sulphur (0 × S), half the standard concentration (0.5 × S), and twice the standard sulphur concentration (2 × S). Different levels of nitrogen (0 × N and 2 × N), potassium (0 × K and 2 × K), and phosphorus (0 × P and 2 × P) were added except for the control sample. A total of 100 expanding wasabi leaves (10 leaves per treatment) were sampled from the top of the plants on 16 March 2021.
Detached leaves were used to quantify chlorophyll contents, and the reflectance was measured immediately after detaching. Reflectance data were measured using two spectrometers with a plant probe consisting of a halogen light source and a leaf clip (Figure 1). The first spectrometer was the Colorcompass-LF, which is composed of a complementary metal-oxide semiconductor (CMOS) sensor (C12880MA-10, Hamamatsu Photonics, Hamamatsu, Japan) and an SMA-terminated fibre patch cable (M25L05, Thorlabs, NJ, USA) with a 0.22 numerical aperture. The spectra were resampled to 5 nm bands across the wavelength domain from 400 to 850 nm. The second spectrometer was the FieldSpec4 (Malvern Panalytical, Almelo, The Netherlands), which is composed of three detectors (visible and near-infrared [VNIR] and shortwave infrared [SWIR 1 and SWIR 2]); spectral drift occurs at two wavelengths (1000 nm and 1800 nm) due to inherent variations in detector sensitivities. To minimise this inconsistency, the splice correction function of ViewSpec Pro Software (Malvern Panalytical, Almelo, The Netherlands) was applied [58]. It is well known that the leaf chlorophyll content mainly affects reflectance in the 400-780 nm region [59], and the entire wavelength domain of the Colorcompass-LF spans the region from 340 nm to 850 nm. To avoid redundant analyses, reflectance values at wavelengths longer than 850 nm were removed before analysis. Leaf discs were collected after the reflectance measurements were completed, and the absorbance of dimethylformamide extracts was measured using a dual-beam scanning ultraviolet-visible spectrophotometer (UV-1900, Shimadzu, Kyoto, Japan). Wellburn's method [60] was applied to quantify the chlorophyll content based on absorption.
N,N-Dimethylformamide was used to prepare extracts, from which the chlorophyll-a (Chl-a) and chlorophyll-b (Chl-b) contents (in µg mL⁻¹) were calculated according to Equations (1)-(3); the chlorophyll unit was then converted to µg/cm² using the leaf disc area.
Chl-a (µg mL⁻¹) = 12.00 × (A663.8 …
where A is the absorbance and the subscripts are the wavelengths (in nm).
A stratified random sampling approach, in which all measurements are divided into smaller sub-groups (strata) before sampling, was applied, and the strata were based on treatments. The measurements were divided into a training dataset (75%) and a test dataset (25%) following a previous study [49], and this procedure was repeated one hundred times to ensure robust results.
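The stratified 75%/25% split repeated one hundred times can be sketched as follows. This is a minimal illustration and not the code used in the study: the function name, the stratum labels, the per-stratum rounding rule and the use of the repetition number as the random seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split sample indices into training and test sets, stratum by stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for idx, label in enumerate(labels):
        by_stratum[label].append(idx)

    train, test = [], []
    for label in sorted(by_stratum):
        members = by_stratum[label][:]
        rng.shuffle(members)
        # Assumption: the training share is rounded within each stratum.
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return sorted(train), sorted(test)


# Hypothetical stratum labels: 2 radish treatments x 72 leaves
# and 10 wasabi treatments x 10 leaves, as described in the text.
labels = ["radish_control"] * 72 + ["radish_slag"] * 72
labels += [f"wasabi_t{k}" for k in range(10) for _ in range(10)]

# One split per repetition; each seed gives a different random partition.
splits = [stratified_split(labels, 0.75, seed=rep) for rep in range(100)]
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

Because the split is made inside each treatment, every treatment is represented in both the training and the test dataset of every repetition.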

Pre-Processing of the Raw Reflectance Data
Pre-processing is an essential step to remove noise from original reflectance data and to improve regression models. In this study, spectra after de-trending (DT) and standard normal variate (SNV) correction were evaluated in addition to the original reflectance (OR). DT and SNV were implemented using the "prospectr" package [61] in R version 4.0.2 [62].

De-Trending (DT)
In DT, the baseline is assumed to be a second-degree polynomial function of the wavelength and is subtracted from the spectrum. This technique has also been used to account for the variation in baseline shifts and curvilinearity by fitting a second-degree polynomial through each spectrum [42].
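The baseline subtraction described above can be illustrated with a short sketch. It is not the "prospectr" implementation used in the study; it only fits a second-degree polynomial to one spectrum by least squares and returns the residual. The function names, the rescaling of the wavelength axis and the synthetic spectrum are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def detrend(wavelengths, spectrum, degree=2):
    """Subtract a least-squares polynomial baseline from one spectrum."""
    # Rescale wavelengths to [-1, 1] so the normal equations stay well conditioned.
    mid = (max(wavelengths) + min(wavelengths)) / 2.0
    half = (max(wavelengths) - min(wavelengths)) / 2.0
    t = [(w - mid) / half for w in wavelengths]
    # Normal equations for the coefficients c_0..c_degree of sum_k c_k * t**k.
    a = [[sum(ti ** (j + k) for ti in t) for k in range(degree + 1)]
         for j in range(degree + 1)]
    b = [sum(y * ti ** j for ti, y in zip(t, spectrum)) for j in range(degree + 1)]
    coeffs = solve_linear_system(a, b)
    baseline = [sum(c * ti ** k for k, c in enumerate(coeffs)) for ti in t]
    return [y - base for y, base in zip(spectrum, baseline)]


wavelengths = list(range(400, 855, 5))  # 400-850 nm in 5 nm bands
# Synthetic spectrum made only of a quadratic baseline, so the residual is close to zero.
spectrum = [0.1 + 1e-4 * (w - 400) + 2e-7 * (w - 400) ** 2 for w in wavelengths]
residual = detrend(wavelengths, spectrum)
print(max(abs(r) for r in residual))
```

On a real leaf spectrum the residual keeps the narrow absorption features, while the slowly varying baseline shift and curvature are removed.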

Standard Normal Variate (SNV)
SNV is effective in reducing multiplicative effects of scattering and particle size and is able to correct multiple scattering noise caused by the surface structure of leaves. This is mathematically expressed as follows:
$$\mathrm{SNV}_{i,j} = \frac{x_{i,j} - \bar{x}_i}{\sqrt{\dfrac{\sum_{j=1}^{p} (x_{i,j} - \bar{x}_i)^2}{p - 1}}} \qquad (4)$$
where $\mathrm{SNV}_{i,j}$ is the reflectance value after SNV, $x_{i,j}$ is the corresponding original reflectance value of spectrum $i$ at variable (wavelength) $j$, $\bar{x}_i$ is the mean of spectrum $i$, and $p$ is the number of variables or wavelengths in the spectrum [63].
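A minimal sketch of the SNV transformation for a single spectrum is given below. It assumes the usual definition, in which each spectrum is centred on its own mean and divided by its own sample standard deviation (denominator p - 1); the function name and the example values are hypothetical.

```python
import math


def snv(spectrum):
    """Centre one spectrum on its mean and scale it by its sample standard deviation."""
    p = len(spectrum)
    mean = sum(spectrum) / p
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (p - 1))
    return [(x - mean) / sd for x in spectrum]


raw = [0.10, 0.12, 0.20, 0.45, 0.50]
# Same spectral shape with a multiplicative gain and an additive offset applied.
scaled = [2.0 * x + 0.05 for x in raw]

snv_raw = snv(raw)
snv_scaled = snv(scaled)
print(max(abs(a - b) for a, b in zip(snv_raw, snv_scaled)))
```

The two transformed spectra coincide up to rounding, which is why SNV removes gain and offset differences between measurements of otherwise identical leaves.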

Model Development
CNNs have been applied to automatically detect features of interest from the given data, and 1D-CNN can provide accurate results for 1D data [52]. A 1D-CNN has an input layer, hidden layers (convolutional, pooling, fully connected and normalisation) and an output layer. Convolution was applied to the reflectance data to extract a feature map using a convolution filter, and each unit in the convolutional layer was connected to local features in the feature map. After the convolution operation, a pooling layer was used for the dimensional reduction of the feature map, which reduces the computational cost and limits the overfitting of the network while preserving important information. In this study, max pooling and the rectified linear unit (ReLU) activation were applied. It has been reported that 1D-CNN is effective for estimating the concentrations of the major and minor pigments from reflectance and absorption coefficient spectral inputs [64]. In this study, samples with both low and high chlorophyll contents were included, which helped to generate robust regression models. The architecture was composed of 10 hidden layers, namely four convolutional layers, four max-pooling layers and two fully connected layers; two dropout rates, 0.4 and 0.2, were used following previous studies [47,52]. The regression models based on 1D-CNN were generated using Google Colaboratory.
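The three operations named above (convolution, ReLU and max pooling) can be illustrated on a toy spectrum. This sketch shows a single convolution block only, not the 10-layer network of the study; the filter weights, pooling size and reflectance values are hypothetical, and in a trained network the filter weights are learned.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation, the operation deep learning libraries call convolution."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Set negative activations to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Toy reflectance values with a sharp rise at the end, loosely resembling a red edge.
reflectance = [0.05, 0.06, 0.10, 0.08, 0.05, 0.04, 0.30, 0.45, 0.48]
edge_kernel = [-1.0, 0.0, 1.0]  # hypothetical filter that responds to rising reflectance

feature_map = conv1d(reflectance, edge_kernel)
activated = relu(feature_map)
pooled = max_pool1d(activated, size=2)
print(pooled)
```

The pooled output is shorter than the input but still carries the strongest response, located at the steep rise in reflectance, which is the sense in which pooling reduces dimensionality while preserving important information.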

Deep Belief Nets (DBNs)
DBNs consist of multiple layers of unsupervised restricted Boltzmann machines and provide a better weight initialisation for a deep neural network than random weights [65]. DBNs can be effectively used to perform layer-by-layer pre-training intended to initialise the training of a backpropagation algorithm [66]. DBNs have been applied to extract vegetation properties, such as quality (chlorophyll-a content) and stress (chlorophyll-a:b ratio), from hyperspectral data for improved tea tree management, and some pre-processing could be reduced [54,67]. The initial configurations were the learning rate (0.1), the maximum iteration number of the pre-training dataset (100), the learning rate of the pre-training dataset (0.01), the maximum iteration number of the training dataset (100), and the batch data size (10) following previous studies [67,68]. DBN regression was implemented using the "darch" package in R version 4.0.6 [69].

Statistical Criteria
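The model performance was evaluated using the ratio of performance to deviation (RPD, Equation (5)). RPD allows the performance on different datasets to be compared directly and is used especially to examine robustness across different datasets. In addition to RPD, the root mean square error (RMSE, Equation (6)) and coefficient of determination (R², Equation (7)) were calculated using:
$$\mathrm{RPD} = \frac{SD}{SEP} \qquad (5)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \qquad (6)$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (7)$$
where $SD$ is the standard deviation of the measurements, $SEP$ is the standard error of prediction, $n$ is the number of samples, $y_i$ is the real value, $\hat{y}_i$ is the estimated value, and $\bar{y}$ is the mean of the measurements. Based on RPD, regression models were classified into Category A (RPD > 2.0), Category B (RPD of 1.4-2.0) and Category C (RPD < 1.4). Chang et al. [71] claimed that Category B can be improved by using different calibration strategies, but properties in Category C may not be reliably predicted.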
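The three criteria can be computed as in the sketch below. The text does not give a formula for SEP, so the code assumes the bias-corrected standard deviation of the prediction residuals; the category boundaries follow the thresholds quoted in the text (RPD > 2.0 for Category A and RPD > 1.4 as acceptable), and the treatment of exactly 1.4 as Category B is an assumption. All function names and the example values are hypothetical.

```python
import math


def sample_sd(values):
    """Sample standard deviation (denominator n - 1)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def rmse(y_true, y_pred):
    """Root mean square error of the estimates."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Ratio of performance to deviation, SD / SEP."""
    # Assumption: SEP is the bias-corrected standard deviation of the residuals.
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    return sample_sd(y_true) / sample_sd(residuals)


def rpd_category(value):
    """Map an RPD value to Category A, B or C."""
    if value > 2.0:
        return "A"
    if value >= 1.4:
        return "B"
    return "C"


measured = [10.0, 20.0, 30.0, 40.0, 50.0]   # hypothetical chlorophyll contents
estimated = [12.0, 18.0, 33.0, 39.0, 51.0]  # hypothetical model estimates
print(rmse(measured, estimated), r_squared(measured, estimated),
      rpd(measured, estimated), rpd_category(rpd(measured, estimated)))
```

Unlike RMSE, RPD is dimensionless because both SD and SEP carry the unit of the measurements, which is what makes it comparable across the radish and wasabi datasets.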
The sensitivity of spectral wavelengths was evaluated using the variance principle [52,72]. For wavelength $i$ (nm), the sensitivity $S_i$ was calculated as follows:
$$S_i = \frac{\mathrm{Var}\left(f(X_i) - f(\bar{X})\right)}{\mathrm{Var}(Y)} \qquad (8)$$
where Var is the variance, $f(X_i)$ is the prediction obtained when wavelength $i$ (nm) varies with the other wavelengths held constant at their mean values, $f(\bar{X})$ is the estimated value based on the mean reflectance, and $Y$ represents the measured chlorophyll content. Following the calculation of $S_i$, the scores were converted to percentages.
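The one-wavelength-at-a-time procedure described above can be sketched as follows. The sketch assumes that the sensitivity of a band is the variance of the change in prediction, divided by the variance of the measured values, and that "converted to percentages" means rescaling the scores to sum to 100. The linear toy model stands in for a trained regression model; it and all names and values are hypothetical.

```python
def variance(values):
    """Sample variance (denominator n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def wavelength_sensitivity(predict, spectra, y):
    """Percentage sensitivity of each band, varying one band at a time."""
    n_bands = len(spectra[0])
    mean_spectrum = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_bands)]
    baseline = predict(mean_spectrum)  # estimate for the mean reflectance
    var_y = variance(y)
    scores = []
    for j in range(n_bands):
        deltas = []
        for s in spectra:
            probe = mean_spectrum[:]   # other bands held at their mean values
            probe[j] = s[j]            # only band j takes its observed value
            deltas.append(predict(probe) - baseline)
        scores.append(variance(deltas) / var_y)
    total = sum(scores)
    # Assumption: percentages are the scores rescaled to sum to 100.
    return [100.0 * score / total for score in scores]


# Hypothetical stand-in for a trained model: a linear function of three bands.
weights = [0.0, -40.0, -80.0]


def toy_model(spectrum):
    return 60.0 + sum(w * x for w, x in zip(weights, spectrum))


spectra = [[0.05, 0.10, 0.30], [0.06, 0.14, 0.25],
           [0.04, 0.08, 0.35], [0.05, 0.12, 0.20]]
y = [toy_model(s) for s in spectra]
shares = wavelength_sensitivity(toy_model, spectra, y)
print(shares)
```

In this toy case the first band has zero weight and therefore zero sensitivity, and the third band dominates because it combines the largest weight with the largest spread in reflectance.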

Chlorophyll Contents in Each Treatment
The chlorophyll contents per leaf area of radish were generally higher than those of wasabi, ranging from 42.20 to 94.39 µg/cm² for radish and from 11.39 to 40.40 µg/cm² for wasabi (Table 1). No significant difference was observed between the two treatments for radish (p > 0.1, Tukey-Kramer test), indicating that slag fertiliser had no effect on the chlorophyll content of the radish leaves. Except for phosphorus, increasing the fertiliser concentration increased the chlorophyll content.

Spectral Reflectance
The mean reflectance of each crop measured by the spectrometers is shown in Figure 2. The decrease in reflectance at 825 nm was due to the low sensitivity of the C12880MA-10, which is the basis of the Colorcompass-LF. Due to its high chlorophyll contents, the reflectance values of radish in the green region were lower than those of wasabi. For wasabi, high negative correlation coefficients (p < 0.001) were confirmed at 525 nm (r = −0.813 and −0.795 for the Colorcompass-LF and FieldSpec4, respectively). Significant negative correlations (p < 0.01) were also observed for radish, although the absolute values were lower than those for wasabi (r = −0.205 and −0.391 for the Colorcompass-LF and FieldSpec4, respectively). Furthermore, strong negative correlation coefficients were observed for the red edge inflection point (REIP), and the values were −0.867 and −0.802 for the Colorcompass-LF and FieldSpec4, respectively. In contrast to the reflectance in the green region, a strong negative correlation was observed for the measurements from the FieldSpec4 (r = −0.670, p < 0.001), while a lower correlation was observed for the measurements from the Colorcompass-LF (r = −0.326, p < 0.01).

Accuracy Assessment
The evaluation results of each algorithm and pre-processing technique for measurements from both the Colorcompass-LF and FieldSpec4 are presented in Table 2. Although the OR of both spectrometers was acceptable for estimating the chlorophyll content (RPD > 1.4), both pre-processing techniques effectively improved the estimation accuracies and fitting of the regression models. DT was effective, and the regression models were categorised as 'A' (RPD > 2.0). The best combination was 1D-CNN and DT for both spectrometers, and the Colorcompass-LF had almost the same RPD for chlorophyll content estimation as the FieldSpec4 (4.31 ± 0.40 vs. 4.33 ± 0.40). Figure 3 shows the relationships between measured and estimated values when 1D-CNN was applied. The pre-processing techniques effectively reduced the standard deviation of the estimation errors. In particular, the standard deviations of the RMSEs decreased after DT, and these values were 0.49 and 0.48 µg/cm² for the Colorcompass-LF and FieldSpec4 measurements, respectively.

Sensitivity Analysis
Although the peaks of the importance were obscure when OR was applied, the importance of REIP for chlorophyll content estimation increased after the pre-processing techniques were applied when 1D-CNN was used (Figure 4a,c), and the most important wavelengths for DT and SNV were 710 nm and 715 nm, respectively, for Colorcompass-LF, and 720 nm and 705 nm, respectively, for FieldSpec4. The importance of the green peak was also confirmed for the FieldSpec4 measurements. On the contrary, this was not observed for DBN (Figure 4b,d).

Spectrometer Comparison
When OR was applied, the estimation results from the Colorcompass-LF measurements were superior to those from the FieldSpec4 in 60 of the 100 repetitions. The typical FWHM of the C12880MA-10, which is the basis of the Colorcompass-LF, is 12 nm, while the FieldSpec4 has a narrower FWHM (its spectral resolution is 3 nm). Furthermore, the relative sensitivity of the C12880MA-10 is less than 0.5 at 700 nm [73]. To reduce the influence of the low sensitivity of this spectrometer, a plant probe with a halogen light source and a leaf clip with replaceable white and black background standards was developed and used in this study. The FieldSpec4 relies on a commercial plant probe, which deteriorates over time. However, pre-processing techniques were required to improve the estimation accuracies of both sensors (Table 2).
The best sensor, algorithm, and pre-processing technique combinations after 100 repetitions based on the RPD value are listed in Table 3. The estimation results from the FieldSpec4 measurements were superior to those from the Colorcompass-LF in 63 of the 100 repetitions. However, the combination of 1D-CNN and DT effectively improved the chlorophyll content estimation accuracy, and the RPD values ranged from 3.37 to 5.38 for the Colorcompass-LF and from 3.46 to 5.28 for the FieldSpec4. Therefore, almost the same estimation accuracies can be expected from the Colorcompass-LF measurements when 1D-CNN and DT are applied.

Optimal Machine Learning Algorithms
After 100 repetitions, 1D-CNN was included in the best combination (Table 3), and the accuracy of 1D-CNN was generally superior to that of DBN for the measurements from both sensors, although DBN had higher accuracies in 6, 2 and 17 of the 100 repetitions for the Colorcompass-LF original spectra, Colorcompass-LF SNV spectra, and FieldSpec4 original spectra, respectively. The minimum RPD values were 1.59, 2.87, and 1.41 for these combinations, and all estimation results were acceptable when 1D-CNN was applied. The strong performance of 1D-CNN-based regression models has been shown for soil property prediction using Vis-NIR spectral data [47,52,72], and the advantage of this algorithm was confirmed for leaf chlorophyll estimation from spectral reflectance between 400 nm and 850 nm. The important wavebands of the 1D-CNN model were also evaluated, and it was confirmed that REIP played the most important role in chlorophyll content estimation for the pre-processed spectra. High sensitivity of the green peak to chlorophyll content has also been reported, and some vegetation indices based on the green peak have been proposed [22,23,74-76]. However, the importance of the wavelengths around the green peak was low (less than 5%) for the pre-processed spectra. It has been reported that water stress moves the green peak position toward longer wavelengths [77], and the nutrient content in the Wagner pots may have influenced the water status of the cultivated wasabi plants. However, the REIPs of the mean spectra ranged from 710 nm to 715 nm (for both spectrometers), while the green peak ranged from 520 nm to 530 nm for the Colorcompass-LF measurements and was 520 nm for the FieldSpec4 measurements. As a result, there were no large shifts for the two bands.
It has been reported that anthocyanin induction is strongly influenced by a low nitrogen concentration, and the absorption peak of anthocyanin corresponds to the green peak region, but there is no influence on the red edge region [78]. Therefore, REIP had relatively high sensitivities in the regression models. In future measurements, assessments of the influence of anthocyanin contents should be considered. The tendencies of important wavelengths were obscure for both spectrometer measurements when DBN was applied, as observed in results of previous studies, and this tendency is important for processing the spectra, including noise reduction [43,67]. Indeed, the application of the pre-processing techniques was less effective (Table 2); however, the performances were lower than those of the 1D-CNN-based regression models, even for the original reflectance.

Conclusions
In this study, hyperspectral data were acquired using a low-cost complementary metal-oxide semiconductor (CMOS) sensor, Colorcompass-LF, and an Analytical Spectral Devices (ASD) FieldSpec4 to evaluate the performance of the Colorcompass-LF.
De-trending based on a second-degree polynomial effectively removed noise from the Colorcompass-LF and FieldSpec4 measurements, and the ratio of performance to deviation values reached 3-4 when one-dimensional convolutional neural network-based regression models were applied. As a result, the low-cost reflectance measurement system (Colorcompass-LF) estimates the chlorophyll content with almost the same accuracy as the high-specification spectrometer (ASD FieldSpec4). The information provided by the Colorcompass-LF can be used for more suitable nutrient management, facilitating quality control and plant maintenance for less-experienced farmers through low-cost field-scale monitoring.

Data Availability Statement:
The data that support the findings of this study are available on request from the corresponding author.