One-Dimensional Convolutional Neural Networks for Hyperspectral Analysis of Nitrogen in Plant Leaves

Featured Application: The proposed methodology estimates the amount of nitrogen in plant leaves using spectral information in the visible (Vis) and near-infrared (NIR) ranges, obtaining a mean relative error below 1%. Thus, it will enable the development of portable devices to detect overuse of nitrogen fertilizers in crops in a fast and non-destructive way. Although it has been tested on cucumber plants, the proposed method can be applied to other types of horticultural crops by repeating the training of the neural network when new datasets of spectral data and measured nitrogen are available.
Abstract: Accurately determining the nutritional status of plants can prevent many diseases caused by fertilizer disorders. Leaf analysis is one of the most widely used methods for this purpose. However, in order to get a more accurate result, disorders must be identified before symptoms appear. Therefore, this study aims to identify leaves with excessive nitrogen using one-dimensional convolutional neural networks (1D-CNN), implemented with the Keras library, on a dataset of spectral data. Cucumber seeds were planted in several pots and, once the plants had grown, they were divided into four classes: control (without excess nitrogen), N30% (excess application of nitrogen fertilizer by 30%), N60% (60% overdose), and N90% (90% overdose). Hyperspectral data of the samples in the 400–1100 nm range were captured using a hyperspectral camera. The actual amount of nitrogen for each leaf was measured using the Kjeldahl method. Since there were statistically significant differences between the classes, an individual prediction model was designed for each class based on the 1D-CNN algorithm. The main innovation of the present research resides in the application of separate prediction models for each class, and in the design of the proposed 1D-CNN regression model. The results showed that the coefficient of determination and the mean squared error for the classes N30%, N60% and N90% were 0.962 and 0.0005; 0.968 and 0.0003; and 0.967 and 0.0007, respectively. Therefore, the proposed method can be effectively used to detect over-application of nitrogen fertilizers in plants.


Introduction
Proper management of inputs, that is, the application of fertilizer in accordance with the nutritional needs of each plant, is necessary to achieve a healthy crop and optimal yield, which is a new concept in the so-called field of precision agriculture [1]. Among the nutrients needed by plants, nitrogen is one of the most important.
Watchareeruetai et al. [14] used a CNN for studying deficiencies in plants considering different nutrients, such as Ca, Fe, K, Mg and N. A dataset consisting of 3000 leaf images was analyzed. The results indicated that the proposed method is superior to trained humans in the detection of nutrient deficiency. The CNN classifier had an accuracy of 94%. Espejo-García et al. [15] aimed to use fine-tuning, instead of ImageNet, in pre-trained CNNs in order to improve the obtained performance. Experimental results showed that the overall performance can be increased by the proposed method. Some structures, such as Xception and Inception-Resnet, improved it by 0.51% and 1.89%, respectively. Comparing machine learning with deep learning (DL), Sharma et al. [16] stated that DL has been analyzed and implemented in various applications and has shown remarkable results; thus, DL methods need to be explored more broadly, since they can also be useful in most fields. Liu et al. [17] categorized hyperspectral images using long short-term memory (LSTM). Specifically, for each pixel, they fed spectral values one by one into different LSTM channels to learn spectral properties. Principal component analysis (PCA) was used first to extract the first components of a hyperspectral image, and then local patches were selected. The row vectors of each patch were then transferred to the LSTM to determine the spatial properties of the central pixel. Then, in the classifier step, the spectral and spatial properties of the pixels were fed into soft-max classifiers to obtain two different results. A strategy for decision fusion was used to obtain more spatial-spectral results. Tian et al. [18] estimated soluble solids in apples using spectral data and DL. In their proposed model, the spectral data of apples were investigated and determined using a random frog algorithm, and DL was used to train and test the detection of geographical origin with spectral data as input. Partial least squares (PLS) was used to create individual calibration models, and then to estimate soluble solids. Competitive adaptive reweighted sampling (CARS) was used to select the optimal wavelengths. Compared to the individual source model, the proposed multi-source model obtained more accurate results for predicting the soluble solid content of apples from multiple geographical origins, obtaining an RP of 990 and an RMSEP of 0.274. Cai et al. [19] estimated soil nutrients, also using spectroscopy and DL. The simulation results indicated that the proposed model was able to improve the efficiency of obtaining the features with high reliability. This solves the problem of traditional models.
Cucumber is a fruit that is rich in useful nutrients, containing some compounds and antioxidants which may help improve, and even prevent, some diseases in humans [20]. Reliable diagnosis of the nutritional status of agricultural products is an important aspect of farm production, since both excesses and deficiencies of nutrients can lead to damage and reduced yields. According to the literature, most previous research works used statistical methods, simple perceptron neural networks and DL neural networks with predefined structures, such as ImageNet and LSTM [21]. Although all these methods were successful on their own, more accurate methods are needed for everyday use.
Consequently, the present paper describes a new one-dimensional (1D) convolutional neural network to estimate the nitrogen content of cucumber leaves. The innovation of this paper resides in the application of nitrogen overdose at 3 different levels, in order to investigate whether it is possible to detect nitrogen-rich cucumber on site and in real time. Hyperspectral analysis of fruits and plants has been a very active area of research in the literature, and particularly in our research group, as previously seen. The main novelty of the present work resides in the proposal of a new 1D-CNN architecture with 12 layers. It contains six convolutional layers and three max-pooling layers, finishing with a dense layer, which produces the final estimation of nitrogen content. Another important element is the addition of a dropout layer that, by randomly deactivating some units of the previous layer during training, helps to avoid overfitting.

Materials and Methods
The constituent stages of the proposed methodology for non-destructive estimation of nitrogen in cucumber plants can be seen in Figure 1. The steps of this methodology are described in detail in the following subsections.

Cultivation and Collection of Samples to Extract Spectral Data
According to the stages of the methodology in Figure 1, the first step is to prepare the samples with standard nitrogen fertilizer and with nitrogen overdose. Hence, several seeds of cucumber of the Super Arshiya'F1 variety were planted in 40 pots and grown under laboratory conditions. All pots received the same inputs until the leaves grew; then the pots were divided into 4 groups of 10 pots. The first group, with standard nitrogen, was considered the control treatment. A sample view of the pots in this category is shown in Figure 2. The second, third and fourth groups received nitrogen fertilizer overdoses of 30%, 60% and 90%, respectively.
Twenty-four hours after applying the treatments, sample leaves were obtained from each group and their spectral signatures were extracted by a hyperspectral camera. The process for obtaining the samples of the dataset was as follows. First, 10 leaves were picked from each category, making a total of 40 cucumber leaves. For each leaf, the spectral images were captured, as described in Section 2.3, giving 327 images per leaf (one for each spectral band considered). This makes a total of 13,080 images. Then, the leaves were transferred to the laboratory for the measurement of the nitrogen content by chemical analysis, as presented in Section 2.4. After manually analyzing the images, 10 patches were selected for each leaf in the 327 images, calculating the mean of each patch. Thus, 100 samples are available for each category, each consisting of a tuple of 327 values and the corresponding measured nitrogen content.
Although the process for obtaining this data is expensive (in terms of volume of images and laboratory analysis), the number of 100 samples per class is clearly insufficient for deep neural network analysis. For this reason, data augmentation was performed to obtain more synthetic samples. This was carried out by random weighted averaging of the real samples (i.e., computing a weighted average of two random tuples, and the same weighted average of the corresponding measured nitrogen content). Using this procedure, 900 synthetic samples per class were produced, totaling 1000 samples per class in the dataset (i.e., 4000 samples in the four classes).
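The augmentation step described above can be sketched as follows. This is only an illustrative reading of "random weighted averaging": the function name `augment`, the uniform weight in [0, 1] and the choice of two distinct parent samples are assumptions, since the text does not specify how the weights or the pairs were drawn.

```python
import random


def augment(samples, n_synthetic, rng=None):
    """Create synthetic (spectrum, nitrogen) pairs by random weighted averaging.

    samples: list of (spectrum, nitrogen) tuples, where spectrum is a list of
             band values (327 per sample in the paper) and nitrogen is a float.
    The weight distribution (uniform) is an assumption, not taken from the paper.
    """
    rng = rng or random.Random()
    synthetic = []
    for _ in range(n_synthetic):
        # Pick two distinct real samples and one random weight.
        (spec_a, n_a), (spec_b, n_b) = rng.sample(samples, 2)
        w = rng.random()
        # The same weight is applied to the spectra and to the nitrogen values.
        spectrum = [w * a + (1.0 - w) * b for a, b in zip(spec_a, spec_b)]
        nitrogen = w * n_a + (1.0 - w) * n_b
        synthetic.append((spectrum, nitrogen))
    return synthetic
```

To keep training and test data disjoint, such a function would be called separately on the real training samples and on the real test samples of each class, after the partition has been made.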

Extraction of Spectral Data from Cucumber Leaves and Statistical Analysis
For the extraction of the spectral information of the cucumber leaves, a hyperspectral camera (Noor Iman Tajhiz Co., Kashan, Isfahan, Iran) operating in the range of 400-1100 nm was used. More specifically, 327 spectral images uniformly distributed in this range were obtained for each leaf, with a wavelength increment of 3.37 nm. To block out ambient light, the camera was located in an illumination chamber and exposed to two 10-watt tungsten halogen lamps (SLI-CAL, StellarNet Inc., Tampa, FL, USA). The spectral information in the Vis-NIR region was extracted and stored in a laptop (DELL Co., Round Rock, TX, USA) with an Intel Core i5-2430M at 2.40 GHz, 4 GB of RAM, and Windows 10. The original spectra were corrected by two methods: multiplicative scatter correction (MSC) in ParLeS software (Raphael Viscarra Rossel, Curtin University, Bentley, Australia), shown in Figure 3; and smoothing by the Savitzky-Golay (SG) filtering algorithm [22].
Before creating the regression models, we checked whether the spectral information of the treatments is significantly different in statistical terms. The spectral data of the four treatments (control, N30%, N60% and N90%) were examined with two statistical tests: the ANOVA test [23] and the Duncan test [24]. The null hypothesis is that the mean values of the spectral wavelengths are the same for all the categories, and the alternative hypothesis is that they are different. That is, the input to the statistical tests is the set of 4000 samples, each of them a tuple of 327 values.
As is well known, the ANOVA test analyzes the null hypothesis for all the classes, while the Duncan test is a multiple comparison procedure between pairs of classes. The results obtained for the ANOVA test are presented in Table 1, and those corresponding to the Duncan test are contained in Table 2.
The results of these tests indicate that the differences between the classes are statistically significant, both in the ANOVA test (Table 1) and in the Duncan test (Table 2), with a high level of significance. That is, the spectral wavelengths that reflect the presence of nitrogen are affected by the amount of nitrogen contained in the leaves. Thus, this justifies the development of different regression models for each class, since each of them produces distinct spectral features.
Appl. Sci. 2021, 11, 11853
Table 1. ANOVA analysis for spectral data of the four categories of treatments: control, N30%, N60% and N90%. The rows indicate the sum of squares between and within the classes, the number of degrees of freedom, the means of the squares between and within classes, the F-score (mean square between classes divided by mean square within classes), and the p-value of significance for an alpha of 0.05.
These tests do not provide any evidence about the linearity or non-linearity between the spectral information and the nitrogen content of the leaves. They only indicate that each class produces different spectral data. This is the reason for creating separate regression models. Although we could create a single regression model for all the classes, the statistical tests imply that, since the spectral data are different, more precise results can be obtained with individual models for each category.
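The quantities listed in the caption of Table 1 (sums of squares, degrees of freedom, mean squares and F-score) can be illustrated with a minimal one-way ANOVA sketch. It assumes one scalar value per sample (for instance, a single spectral band); how the 327-value tuples were reduced for the test is not stated in the text, and the function name `one_way_anova` is hypothetical. The p-value is omitted because it requires the F-distribution, which is not in the Python standard library.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups of floats.

    groups: one list of scalar observations per class (e.g. control, N30%,
            N60%, N90%). Treating each sample as one scalar is an assumption.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation explained by class membership.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Residual variation inside each class.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_score": ms_between / ms_within,
    }
```

A large F-score means that the variation between class means is large compared with the variation inside the classes, which is the situation reported for the four treatments.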

Extraction of Nitrogen in Cucumber Leaves Using Laboratory Destructive Method
The Kjeldahl method was used to measure the total nitrogen content of the leaves. This method [25] includes three steps: digestion, distillation and titration. First, the leaves were dried in an oven and then powdered. Then, they were digested with sulfuric acid, so that the nitrogen in the sample was converted to ammonium sulfate. The nitrogen in the ammonium sulfate was released in the form of ammonia, converted to ammonium borate with boric acid, and titrated using 1% normal sulfuric acid. The total nitrogen content of the sample was then obtained from the amount of acid consumed, as indicated in Equation (1), where vs is the volume consumed by the samples (mL), vb is the volume consumed by the control treatment (mL), N H2SO4 is the normality of the sulfuric acid (equivalents/L), and md is the dry weight of the sample (g).
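As an illustration of how the titration quantities defined above combine, the following sketch uses the textbook Kjeldahl relation, in which each milliequivalent of acid consumed corresponds to 14.007 mg of nitrogen. This is an assumption: it is not necessarily the exact form or the units of Equation (1), and the function name is hypothetical. The result is expressed as a mass percentage of the dry sample.

```python
NITROGEN_MG_PER_MEQ = 14.007  # mg of nitrogen per milliequivalent of acid


def kjeldahl_nitrogen_percent(vs_ml, vb_ml, normality, dry_mass_g):
    """Textbook Kjeldahl estimate of total nitrogen, in % of dry mass.

    vs_ml:      acid volume consumed by the sample (mL)
    vb_ml:      acid volume consumed by the blank/control (mL)
    normality:  normality of the sulfuric acid (equivalents/L)
    dry_mass_g: dry weight of the sample (g)
    """
    if dry_mass_g <= 0:
        raise ValueError("dry_mass_g must be positive")
    # mL times eq/L gives milliequivalents of acid neutralised by ammonia.
    milliequivalents = (vs_ml - vb_ml) * normality
    nitrogen_mg = milliequivalents * NITROGEN_MG_PER_MEQ
    # Convert the sample mass to mg and express the ratio as a percentage.
    return nitrogen_mg / (dry_mass_g * 1000.0) * 100.0
```

For example, 10 mL of 0.1 N acid with a zero blank and a 1 g sample corresponds to 1 meq of acid, that is, 14.007 mg of nitrogen, or about 1.4% of the dry mass.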

Non-Destructive Estimation of Nitrogen Using Convolutional Neural Networks
Deep learning (DL) is a category of machine learning algorithms that uses multiple layers to extract high-level features from raw input [26]. These kinds of networks use multiple layers of information processing, especially nonlinear processing, to perform the conversion or extraction of supervised or unsupervised features, generally for the purpose of pattern analysis or recognition, classification and clustering. Most deep learning methods use neural network structures, which is why DL models are often referred to as deep neural networks. Among them, convolutional neural networks (CNN) are one of the most popular techniques, since they do not require manual feature extraction. CNNs learn to recognize features from the samples using a large number of hidden layers. Each hidden layer increases the complexity of the features analyzed. Moreover, CNNs do not change the structure of the input and take into account the connection between neighboring values.
Until now, most of the research has been conducted with known CNN structures such as AlexNet, VGG Net, ZF Net, GoogLeNet and fully convolutional networks (FCN) [27]. In this paper, however, we propose our own custom structure. It was developed by examining several custom structures by trial and error; the structure with the highest prediction rate was selected and is proposed in this work.
More specifically, the proposed structure of the convolutional neural network used in this study is shown in Figure 4. The input vector of the algorithm contains the spectral data of a sample, and the output is the estimated amount of nitrogen. Since, in our case, the input data is a one-dimensional (1D) spectrum, we used 1D convolutions. As shown, the architecture is similar to the well-known funnel design, where the size of the input is progressively reduced while the number of features increases. This is carried out by a succession of 1D convolutions and max-pooling layers. All the convolutions use the rectified linear unit (ReLU) activation function. The first part of the network contains 3 consecutive convolutional layers, followed by three steps of max-pooling and convolution. The second of these steps also includes a dropout layer, a method commonly used to avoid overfitting: it randomly deactivates some of the outputs of the previous convolution during training, thus making the system more robust. The dropout rate was set to 0.1. Finally, the values of the last layer are flattened (producing 20,160 values) and a dense layer is responsible for producing the final estimation of nitrogen content.
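To make the funnel idea concrete, the sketch below traces how the length of a 327-band spectrum shrinks through three convolutions followed by three pooling-plus-convolution steps. The kernel size (3), pool size (2), filter counts and unpadded convolutions are placeholders chosen for illustration, not the values of Table 3, so the flattened size obtained here differs from the 20,160 values reported in the text.

```python
def conv1d_len(n, kernel, stride=1):
    """Output length of an unpadded ("valid") 1D convolution."""
    return (n - kernel) // stride + 1


def maxpool1d_len(n, pool):
    """Output length of non-overlapping max-pooling (stride equal to pool size)."""
    return n // pool


# (kind, size, filters): all hyperparameters here are hypothetical.
HYPOTHETICAL_LAYERS = [
    ("conv", 3, 16), ("conv", 3, 16), ("conv", 3, 32),  # three initial convolutions
    ("pool", 2, None), ("conv", 3, 32),                  # pooling + convolution, step 1
    ("pool", 2, None), ("conv", 3, 64),                  # step 2 (dropout keeps the shape)
    ("pool", 2, None), ("conv", 3, 64),                  # step 3
]


def trace_shapes(input_len, layers):
    """Return (kind, length, features) after each layer of the funnel."""
    length, features = input_len, 1
    shapes = []
    for kind, size, filters in layers:
        if kind == "conv":
            length, features = conv1d_len(length, size), filters
        else:
            length = maxpool1d_len(length, size)
        shapes.append((kind, length, features))
    return shapes


if __name__ == "__main__":
    for kind, length, features in trace_shapes(327, HYPOTHETICAL_LAYERS):
        print(f"{kind:4s} -> {length} x {features}")
```

With these placeholder settings, the length goes from 327 down to 36 while the number of features grows from 1 to 64, which is the behavior the funnel design is meant to produce; the dropout, flatten and dense layers do not change this trace.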

It should be noted that, among the 1000 samples in each category, 70% were used as training data and the remaining 30% as test data, the two sets being disjoint. Moreover, the synthetic samples of the augmentation process were obtained after partitioning the real samples, so there is no mixing of training and test samples. The total number of epochs, batch size, verbose setting and validation split were 200, 12, 1 and 0.1, respectively. The exact parameters of the proposed 1D-CNN are presented in Table 3.
Table 3. Hyperparameters of the 1D convolutional neural network (1D-CNN) structure proposed in this paper for the estimation of nitrogen amount in cucumber leaves. Output shape refers to the size × number of features.

Evaluation of the Performance of the Methods for the Estimation of Nitrogen in Cucumber
To evaluate the performance of the prediction model, the following statistical parameters were used: explained variance score (Var-score) [28], max error (MaxE) [29], mean absolute error (MAE) [30], mean squared error (MSE) [31], coefficient of determination (R²) [32], median absolute error (MedAE) [33] and mean squared logarithmic error (MSLE) [34]. Consider that the expected outputs for a given variable are Y_i, for i ranging over the number of samples, n, and that X_i represents the corresponding estimated values. Then, the proposed performance measures are given in the following equations:
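As a reference for the measures listed above, the following sketch computes them from the expected values Y_i and the estimated values X_i. It follows the usual definitions of these metrics (the ones used, for example, in scikit-learn); it is assumed, but not confirmed by the text, that the paper's equations take the same form. The function name `regression_metrics` is hypothetical.

```python
import math
from statistics import mean, median, pvariance


def regression_metrics(expected, estimated):
    """Standard regression measures for expected (Y_i) and estimated (X_i) values."""
    errors = [y - x for y, x in zip(expected, estimated)]
    abs_errors = [abs(e) for e in errors]
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    y_mean = mean(expected)
    ss_tot = sum((y - y_mean) ** 2 for y in expected)  # total sum of squares
    log_sq = [
        (math.log1p(y) - math.log1p(x)) ** 2 for y, x in zip(expected, estimated)
    ]
    return {
        "MaxE": max(abs_errors),
        "MAE": mean(abs_errors),
        "MSE": ss_res / len(errors),
        "MedAE": median(abs_errors),
        "MSLE": mean(log_sq),
        "R2": 1.0 - ss_res / ss_tot,
        # Explained variance: like R2, but insensitive to a constant bias.
        "Var-score": 1.0 - pvariance(errors) / pvariance(expected),
    }
```

Note that R² and the variance score coincide when the mean error is zero, and that the error measures carry the units of the nitrogen content (squared units in the case of MSE).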

Results
In the following subsections, the results achieved by the regression models created for each treatment are presented. All of them are 1D-CNN networks that are specifically trained for the corresponding class.

Prediction Model for the Category with Excess of Nitrogen by 30%
Preprocessing is needed to reduce the noise in the spectral information, so different correction and smoothing filters were examined. Table 4 contains the results of evaluating the performance of the prediction model for the category with nitrogen overdose by 30%, using different correction and smoothing algorithms. As described in Section 2, these filters are multiplicative scatter correction (MSC) and smoothing by the Savitzky-Golay (SG) filter. All the error measures can be observed to be very low, while R² is close to 1, indicating that the model is able to accurately estimate nitrogen content in this class. Figure 5 shows the regression plot between the amount of nitrogen measured by the Kjeldahl method and the mean estimated nitrogen for the 300 test samples. The proximity of these two values indicates the ability of the proposed model. Figure 6 shows the training loss and validation loss curves for the nitrogen content during the training process. The maximum number of epochs was set to 200. In addition, we added a condition to stop training early if overfitting is detected (i.e., when training loss decreases but validation loss increases). As shown in Figure 6, training stopped after 17 epochs, with a loss very close to 0.
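The early-stopping condition mentioned above can be sketched as a patience rule on the validation loss, which is how the `EarlyStopping` callback of Keras operates. The text does not give the exact criterion or its parameters, so the `patience` and `min_delta` values below, as well as the function name, are assumptions.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its best
    value by more than min_delta for `patience` consecutive epochs. If that
    never happens, all epochs are run.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)
```

With a maximum of 200 epochs, a rule of this kind explains why training can end much earlier, as soon as the validation loss stops decreasing.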

Figure 5. Regression plot of the measured and estimated nitrogen content for category N30%.

Prediction Model for the Category with Excess of Nitrogen by 60%
The statistical criteria for evaluating the performance of the prediction model are given in Table 5 for the category N60%. Again, it can be seen that the error measures are very close to 0, so the proposed model was successful in predicting nitrogen in the treatment of 60% overdose. Moreover, the results indicate the positive effect of the filtering algorithms.
Figure 7 shows the regression plot of the measured and the estimated nitrogen content for this category, for the 300 test samples used. These two values are close to each other, indicating the ability of the proposed model. Figure 8 illustrates that, after 43 epochs, the training process was stopped by the overfitting criterion. At that step, the training had achieved its best result.

Prediction Model for the Category with Excess of Nitrogen by 90%
Regarding the treatment of 90% nitrogen overdose, the statistical criteria for evaluating the performance of the prediction model are given in Table 6, indicating the mean error measures, the variance score and the coefficient of determination. Again, the measures achieved are very positive and indicate a high accuracy of the proposed model.
Figure 9 depicts the regression plot of the measured amount of nitrogen and the mean estimated values for the 300 test samples in this treatment. As indicated in Table 6, the prediction is done with a coefficient of determination of 0.967.
Figure 9. Regression plot of the measured and estimated nitrogen content for category N90%.
Figure 10 shows that, after 41 epochs, the training process stopped due to the detection of the overfitting criterion. Both the training and validation losses present a similar behavior. Initially, there is a fast reduction during the first 5-6 epochs of training. Then, there is a smoother and more progressive reduction of both loss values, approximately until epoch 25. Finally, the system stabilizes and the process is stopped before the validation loss begins to increase (which would mean that overfitting is starting).

Discussion
In general, the results presented in the previous section show that the nitrogen content of cucumber leaves can be estimated with high accuracy using hyperspectral information in the Vis-NIR range and the proposed 1D-CNN regression models. The obtained mean errors are always below 1% of the expected values for all the treatments. For example, in the N 30% class, the mean absolute error (MAE) is only 0.017 mg/g, while the nitrogen content ranges from 3 to 3.4 mg/g; this represents a relative error of about 0.56%. The relative MAE values for the N 60% and N 90% classes are 0.35% and 0.45%, respectively. This means that the proposed approach can be effectively used to estimate nitrogen in the leaves from the early stages, since the error is very small even for the 30% class.
This positive result also holds for the worst cases of error, i.e., the maximum errors (MaxE). This error is 0.086 mg/g for the N 30% class and 0.091 mg/g for the N 60% class, which represent relative errors below 3%. In the N 90% class, the worst case has a very low error of 0.054 mg/g. Therefore, we can conclude that the method does not produce cases of extremely high error. This is evidence of the robustness of the proposed approach in the most difficult cases.
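The error measures discussed above can be computed as in the following minimal sketch. It uses the standard definitions of MAE, MSE, MaxE and R², which are assumed to match those behind the reported tables. The function names and the choice of reference value for the relative error are illustrative assumptions.

```python
def regression_metrics(measured, predicted):
    """Compute MAE, MSE, MaxE and R^2 for paired measured/predicted values."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    max_e = max(abs(e) for e in errors)          # worst-case absolute error
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination
    return {"MAE": mae, "MSE": mse, "MaxE": max_e, "R2": r2}


def relative_error_percent(error, reference):
    """Express an absolute error as a percentage of a reference value."""
    return 100.0 * error / reference
```

For instance, `relative_error_percent(0.017, 3.0)` gives about 0.57% when the lower bound of the N 30% range (3 mg/g) is taken as the reference. This is close to the 0.56% quoted above, although the exact reference value used by the authors is not stated here.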
Regarding the preprocessing techniques applied in the paper, MSC and SG, both steps have positive effects on the obtained results. Note that MSC is a scatter correction filter for the spectra, while SG is a smoothing operator. In all the treatments, the worst results are obtained when no filter is applied, which is evident in all the performance measures. For example, in the 60% class, the MSE is 0.0003 mg/g with the proposed filters, MSC + SG, while it rises to 0.0012 mg/g with no preprocessing filter. If only one filter had to be selected, the smoothing step, SG, is the one that offers the best results by itself in terms of error. For example, in the 90% class, the MAE without any filter is 0.028 mg/g; with SG it drops to 0.025, but with MSC it is 0.027. This indicates that smoothing is more beneficial than scatter correction. The same holds for the determination coefficient, R², which is consistently higher for SG. Moreover, the improvement from applying MSC + SG over applying only SG is very small in most cases, with a slight positive effect on the error measures.
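To make the two preprocessing steps concrete, the following sketch implements a basic form of each in plain Python. It assumes MSC is computed against the mean spectrum of the set, and it fixes the SG filter to a 5-point window with a quadratic polynomial. The window length and polynomial order actually used in the paper are not given in this section, so both settings, and the function names, are illustrative.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum s is fitted as s ~ intercept + slope * ref by least
    squares, then corrected as (s - intercept) / slope.
    """
    n_bands = len(spectra[0])
    ref = [sum(s[i] for s in spectra) / len(spectra) for i in range(n_bands)]
    ref_mean = sum(ref) / n_bands
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Standard Savitzky-Golay smoothing weights for a 5-point quadratic fit.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(spectrum):
    """Savitzky-Golay smoothing (window 5, order 2); edge points are left unchanged."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(c * spectrum[i + k - 2] for k, c in enumerate(SG_COEFFS))
    return out
```

The combined MSC + SG pipeline would then be `[savgol5(s) for s in msc(spectra)]`. In practice, a library routine such as `scipy.signal.savgol_filter` would normally replace the hand-written smoother.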
Analyzing the evolution of the loss curves in Figures 6, 8 and 10, it can be seen that the proposed 1D-CNN converges very quickly for all the treatments. The loss values fall rapidly in the first five epochs. Then, the system enters a gradual convergence phase, which is stopped before over-fitting can occur. No model required more than 43 epochs, which makes them relatively fast to train. The loss curves also indicate that the training process does not over-fit, which would appear as a reduction of the training loss together with an increase in the validation loss.
Finally, Table 7 compares the results achieved by the proposed 1D-CNN models with those of other researchers for the non-destructive estimation of different properties of fruits, using the R² criterion. Although these works have been selected among the most similar to our proposed method, this comparison has to be considered in context, since they refer to different types of fruits, different datasets, and distinct types of regression models. We have also included the results of a previous study from our group that deals with the estimation of nitrogen in tomato leaves using different types of models [35]. Overall, the results of the present study show that a 1D-CNN based on spectral data can estimate excess nitrogen more accurately, achieving results that are in the state of the art.
Comparing the obtained results with the accuracy reported in [35], it can be seen that the R² values are significantly better. The methods evaluated in [35] include partial least squares regression (PLSR), which is a statistical method; a hybrid approach of classical neural networks and the differential evolution algorithm (ANN-DE); and a method based on convolutional neural networks (CNN), similar to the one proposed in this paper. In all of them, a single model is trained for all the nitrogen categories, producing poor regression results, always below 0.8 in R². In contrast, the approach of creating separate models for each class produces better results, above 0.96. In fact, the approach of training different models for the classes was also analyzed in [35], although the R² achieved there ranged from 0.925 to 0.968.
On the other hand, a weak point of the proposed method is that it requires a prior classification of the leaf samples into the corresponding treatment class. Given a new unknown sample, there are three different models that could be applied to it. Thus, a classifier is required to obtain the class before applying the regression network. This classification problem has been previously studied in our group, finding that it can be solved with a classification accuracy above 96% [10]. This would complete a system based on two steps: first, classification of the sample; and then, regression of the nitrogen content using the specific model. The results discussed in this section indicate that this approach is able to produce better results than training a single regression model for all the classes.
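The two-step system outlined above can be sketched as a simple dispatch: a classifier assigns the treatment class, and the regression model trained for that class produces the estimate. The function `predict_nitrogen`, the class labels, and the callable interfaces are hypothetical placeholders, not the authors' implementation. In a real system the callables would wrap the trained classifier and the three 1D-CNN regressors.

```python
from typing import Callable, Dict, Sequence

Spectrum = Sequence[float]


def predict_nitrogen(
    spectrum: Spectrum,
    classify: Callable[[Spectrum], str],
    regressors: Dict[str, Callable[[Spectrum], float]],
) -> float:
    """Two-step estimate: classify the sample, then apply the class-specific regressor."""
    label = classify(spectrum)  # step 1: treatment class, e.g. "N30%"
    if label not in regressors:
        # No regression model exists for this class (hypothetical handling).
        raise KeyError(f"no regression model for class {label!r}")
    return regressors[label](spectrum)  # step 2: class-specific regression
```

One consequence of this design is that classification errors propagate: a sample assigned to the wrong class is scored by a model trained on a different nitrogen range, so the overall accuracy depends on both stages.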

Conclusions
Scientific management of the production and consumption of fertilizers, especially nitrogen-based fertilizers, is necessary to achieve sustainable agriculture and to improve food health through their optimal use. In this paper, we have presented a new methodology for fast and accurate estimation of nitrogen content in cucumber leaves using spectroscopy analysis and convolutional neural networks (CNN).
Currently, CNNs are one of the most popular methods of machine learning, since they do not require manual feature extraction. Instead, automatic feature extraction makes deep learning models very accurate in computer vision tasks, including regression problems. Therefore, we have presented a new structure of one-dimensional CNN (1D-CNN) to estimate nitrogen content. Nitrogen fertilizer overdoses of 30%, 60% and 90% were applied to the cucumber plants, and then a prediction model was trained for each treatment. The experimental results have shown that the proposed 1D-CNN is able to estimate the nitrogen content very accurately even in the early stages, achieving, for example, a maximum error of 0.086 mg/g and an R² of 0.962 for the 30% treatment. For the 60% and 90% classes, the R² values are 0.968 and 0.967, respectively. The experiments also showed that the applied preprocessing filters, multiplicative scatter correction (MSC) and Savitzky-Golay (SG) smoothing, had a positive effect on the accuracy. Thus, this approach can be used to create a fast and non-destructive method to prevent excessive use of fertilizers.

Data Availability Statement:
The data presented in this study are available upon reasonable request to the corresponding authors.