agronomy Rapid Detection of Urea Fertilizer Effects on VOC Emissions from Cucumber Fruits Using a MOS E-Nose Sensor Array

Abstract: The widespread use of nitrogen chemical fertilizers in modern agricultural practices has raised concerns over hazardous accumulations of nitrogen-based compounds in crop foods and in agricultural soils due to nitrogen overfertilization. Many vegetables accumulate and retain large amounts of nitrites and nitrates due to repeated nitrogen applications or excess use of nitrogen fertilizers. Consequently, the consumption of high-nitrate crop foods may cause health risks to humans. The effects of varying urea–nitrogen fertilizer application rates on VOC emissions from cucumber fruits were investigated using an experimental MOS electronic-nose (e-nose) device, based on differences in sensor-array responses to volatile emissions from fruits recorded following different urea fertilizer treatments. Urea fertilizer was applied to cucumber plants at treatment rates equivalent to 0, 100, 200, 300, and 400 kg/ha. Cucumber fruits were then harvested twice, 4 and 5 months after seed planting, and evaluated for VOC emissions using the e-nose to assess differences in smellprint signatures associated with different urea application rates. The electrical signals from the e-nose sensor array data outputs were subjected to four aroma classification methods: linear and quadratic discriminant analysis (LDA and QDA), support vector machines (SVM), and artificial neural networks (ANN). The results suggest that combining the MOS e-nose technology with QDA is a promising method for rapidly monitoring urea fertilizer application rates applied to cucumber plants based on changes in VOC emissions from cucumber fruits. This new monitoring tool could be useful in adjusting future urea fertilizer application rates to help prevent nitrogen overfertilization.


Introduction
Fresh market produce, such as raw fruits and vegetables, are popular nutritional foods, because they are rich in important vitamins, minerals, antioxidants, and nutrients that provide positive benefits to human health [1,2]. Cucumber (Cucumis sativus L.), a member of the Cucurbitaceae plant family, is an economically important crop cultivated throughout the world. Cucumbers are widely consumed as edible fruits served in a variety of ways, including freshly sliced in salads, cooked, canned, processed (pickled or salted), or fermented to make pickles [2,3]. Various parts of cucumber plants also may be used in traditional medicine within some cultures [4,5]. Cucumbers are particularly popular in North American, Asian, and European countries, where their consumption is highest [6,7]. The demand for fresh produce is growing worldwide as consumers increasingly recognize the nutritional value and health benefits provided by these foods, but this situation may be contributing to higher rates of consumption of produce with excessive levels of nitrates due to the overapplication of nitrogen fertilizers [8,9].
Urea is one of the most common forms of nitrogen fertilizers applied to agricultural fields to enhance crop production. Using urea fertilizer as an already-reduced, readily available form of nitrogen for plant uptake has been shown through research to significantly improve crop yields. However, the total amount and rate of urea fertilizer applied (per hectare) to agricultural fields should be adjusted according to soil test analysis results, indicating current soil-nitrate levels, to optimize the nitrogen applied at quantities known to be most appropriate for the growth of specific crops [10,11]. In addition to soil pollution, chemical fertilizers may affect human health. Excessive use of organic and chemical nitrogen fertilizers often leads to the accumulation of nitrates in fruits and vegetables [12][13][14]. Leafy vegetables and fruits that contain higher levels of nitrates account for up to 85% of the dietary intake of nitrates in many world societies [15][16][17]. Nitrates consumed through produce may contribute to important human health risks and harmful effects associated with serious diseases [18][19][20].
A critical goal of the food industry is to ensure the safety and quality of foods through multiple stages of the crop production process. A variety of crop harvesting practices, testing methods, and procedures are employed to evaluate and verify the quality of consumable plant products sold in commercial markets. One noninvasive method for evaluating produce and fruit quality is to assess the aroma characteristics based on VOC emissions derived directly from samples of each crop. As a result of the complexity of aromas in most foods, it is difficult and expensive to effectively sample and evaluate the quality of plant products using conventional human smell and taste testers or chemical analysis techniques, such as gas chromatography (GC) and mass spectrometry (MS). This is due to the long analysis times required for chemical assessments and the large daily volume of produce that must be evaluated [21,22].
The development of advanced technologies borrowed from various applied science disciplines, including sensor architecture, electronics, biochemistry, and artificial intelligence, has made it possible to design practical electronic-sensing tools, e.g. electronic noses (e-noses) and electronic tongues (e-tongues), for measuring such produce quality factors as aroma, taste, and chemical constituents of various products in the food industry [23][24][25]. For development of these devices as effective tools for maintaining and monitoring food quality controls, they must be sufficiently sensitive, relatively cheap, and operate rapidly with short sensor recovery times for repeated analysis with high-throughput capabilities for routinely assessing large volumes of produce in commerce. An olfactory machine (electronic nose) is one modern tool that has been used effectively for determining the food quality [26,27]. These devices are comprised of a multiple-sensor array capable of detecting complex mixtures of volatile organic compounds (VOCs) in the ambient air or headspace volatiles captured from organic samples (plant products), which are converted by transducers from electrical signals into digital signals from the sensor array and collectively assembled to form a smellprint signature uniquely representative of the sample VOC composition. The resulting e-nose output signals are transmitted to a computer and analyzed by software using pattern recognition algorithms [21,24]. Moreover, e-noses may be easily trained to identify different types of aromas from specific plant sources (to maintain food quality controls) using various algorithms and statistical programs.
Electronic noses are particularly useful for analyzing and detecting VOCs due to the cross-reactivity of e-nose multisensory arrays and sensitivity to a diversity of volatile compounds from a wide range of organic chemical classes [24,27]. These gas-sensing devices have a long history of effective use with many applications previously developed in numerous industries, including agriculture and forestry [27], food quality and safety [28][29][30][31], biomedical and forensic sciences [32,33], pharmaceutical and drug development [24], human health and disease diagnoses [34][35][36], and environmental protection [37]. In agriculture, electronic noses are useful in assessing quality controls for foods by virtue of electronic aroma detection (EAD) technologies that can rapidly and repeatedly detect, classify, and characterize complex VOC emissions from food products in a relatively short time. Baietto and Wilson [31] provided a thorough review demonstrating the ability of e-nose devices to detect plant VOCs in many different types of fruits and vegetables.
We hypothesize that the rates of urea fertilizer applied to cucumber plants cause changes in VOC emissions from cucumber fruits that can be detected and monitored using e-nose devices. The objectives of this study were to: (1) determine the effects of different rates of urea fertilizer applications on cucumber fruit VOC emissions based on measured differences in the collective responses of an experimental MOS e-nose sensor array; (2) determine whether differences in the e-nose smellprint signatures or sensor array output data may be used as a tool to discriminate and classify cucumber fruits derived from plants receiving varying rates of urea fertilizer; and (3) evaluate four statistical aroma classification methods, namely linear and quadratic discriminant analysis (LDA and QDA), support vector machines (SVM), and artificial neural networks (ANN), for their capability to classify volatile emissions from cucumber fruits based on e-nose data associated with the amount of urea fertilizer applied to individual cucumber plants.

Sample Preparation
A suitable plant cultivation bed was prepared using soil, animal manure, and sand in a 2:1:1 ratio mixture. Each plot in the cultivation bed was divided into five parts with an equal area of 1.8 m². Appropriate amounts of urea fertilizer were mixed into the soil mixture at 0, 100, 200, 300, and 400 kg urea/ha equivalent application rates before seed sowing. Cucumber seeds (cultivar Beta Alpha F1) were planted uniformly within each cultivation plot located within Razi University's research greenhouse and irrigated every two days. Urea fertilizer was the only treatment variable used in this study. Cultivation and watering treatments were performed equally and uniformly for all plants in all five treatment plots. The levels of urea fertilizer applied to cucumber plants were assumed to directly affect the nitrate levels in the fruits, which we hypothesized could be indirectly detected by changes in VOC emissions from cucumber fruits that would be reflected by differences in the e-nose sensor array output (smellprint) patterns. Following the flowering and fruit set, cucumber fruits were harvested within approximately the same size range (10-12 cm in length) and weight range (50-60 g) and were tested by VOC analysis using an experimental e-nose instrument for two harvests 30 days apart (Figure 1). The cucumber samples harvested were labeled with the numbers "1" (for the first harvest 4 months after planting seeds) and "2" (for the second harvest 5 months after planting seeds), indicated in the following text as the first and second harvests, respectively.

Electronic Nose System
An electronic-nose or e-nose system detects and differentiates between complex aromas of VOC emissions from biological systems. This analytical device is composed of a set of sensors in an array that react to volatile gases or vapors produced and released by vegetative tissues of a biological sample. The e-nose works as a multisensory detector to sense VOC emissions by measuring the changes in electrical conductivity or resistance of individual sensors caused by the adsorption of different chemical classes of VOCs to the sensor surface coatings, due to interactions between the semiconductor and analyte gas molecules. Changes in electrical conductivity or resistance are received by a transducer that converts the analog signal from each sensor into digital values. Thus, each sensor in the array responds to all the VOC components present in the sample analyte and produces sensor outputs as electrical signals [18]. Sensors within the array respond differently to sample aromas (VOC emissions) from different sample types. The output responses from all the sensors are then sent to a computer, where they are recorded and analyzed with multivariate data analysis methods to distinguish between differences in sensor responses to VOCs detected in the plant sample headspace, enabling unknown aromas to be classified and identified [38,39]. The e-nose aroma signature (smellprint) patterns produced from each sample type are also often compared using pattern recognition algorithms. The experimental electronic-nose device, constructed in the Department of Mechanical Engineering of Biosystems, Razi University, Kermanshah, Iran [40], consisted of eight metal oxide semiconductor (MOS) sensors. The names, gas sensitivity specifications, common gases detected, and detection ranges (ppm) of the eight sensors used in the experimental e-nose sensor array in this study are listed in Table 1.

Instrument Run Parameters
The sensor array contained eight MOS sensors that have relatively low sensitivity to moisture and are very resistant to moisture effects. During the data collection process, the relative humidity and temperature of the VOC sampling chamber environment were maintained in a narrow range of variability (temperature 20 ± 2.0 °C and relative humidity of 30-40%). The room air temperature also was controlled at 20 ± 0.5 °C during the sample preparation and detection to help minimize changes in the carrier input air relative humidity prior to input in the VOC sampling chamber. The e-nose system was equipped with a 12-watt and 12-V (direct current) TYAP127 diaphragm silicon air pump (Gikfun Inc., Dongguan, China), producing a reference and sampling air flow rate of 1.3 L per minute in the VOC sampling chamber. The data recording rate of the e-nose data collection from the sensor array was 100 data points per second. The total run time for the cucumber fruit VOC air sample analysis was 551 s for all samples.

Data Analysis
The sensor responses from each sample test were collected and sent to a pattern recognition system that operates using a neural net training system. The aroma data collected from the cucumber fruit VOC emissions using the 8-sensor MOS e-nose sensor array were analyzed with artificial intelligence techniques and statistical analysis methods, including linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVM), and artificial neural networks (ANN). These four data analysis methods were compared by their performances in discriminating and classifying e-nose data of fruit VOC emission patterns based on urea fertilizer rates applied in five treatment classes with two harvests per class, resulting in ten total treatment groups analyzed.

Linear and Quadratic Discriminant Analysis
Linear discriminant analysis combines elements of Fisher's linear discriminant, analysis of variance (ANOVA), and regression analysis to find a set of linear equations that separate two or more sample (aroma) classes. The LDA analyses were conducted in the following five steps. First, the mean D-dimensional vector was calculated for each sample class in the original data set (10 fertilizer treatment groups in this study). Second, the between-class and within-class scatter matrices were calculated. Third, the eigenvectors and corresponding eigenvalues were calculated from the between- and within-class matrices. Fourth, the eigenvectors were sorted by decreasing eigenvalue, and the k eigenvectors with the largest eigenvalues were selected. Fifth, each sample was projected into a new subspace using the eigenvector matrix. The data were thereby projected so as to minimize the variance within a group and maximize the distance between different sample groups [41]. Quadratic discriminant analysis uses a similar procedure, but the statistical model is based on quadratic (polynomial) functions rather than linear functions to relate the sensor response outputs to different sample classes for multiclass discriminations.
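The five steps can be illustrated with a minimal sketch. To stay dependency-free it is restricted to two classes and two features, in which case the leading eigenvector of the within-class-inverse times between-class matrix is proportional to the within-class-inverse times the difference of the class means. The readings, group names, and function names below are hypothetical and are not taken from the study, which used 8 sensors, 10 classes, and the Unscrambler software.

```python
# Two-class, two-feature sketch of the LDA steps (hypothetical data).

def mean_vector(rows):
    """Step 1: mean of a list of 2-D feature vectors."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

def scatter(rows, mu):
    """Step 2: 2x2 scatter matrix of one class about its own mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Steps 3-4: with two classes, the only discriminant direction is
    proportional to inv(Sw) applied to (mu_b - mu_a)."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]]
    # Closed-form inverse of the 2x2 matrix applied to the mean difference.
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]

def project(rows, w):
    """Step 5: project each sample onto the discriminant axis."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]

# Hypothetical two-sensor readings for two treatment groups.
group_a = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
group_b = [[2.0, 1.0], [2.2, 1.1], [1.9, 0.8], [2.1, 1.2]]

w = fisher_direction(group_a, group_b)
scores_a = project(group_a, w)
scores_b = project(group_b, w)
print(max(scores_a) < min(scores_b))  # the groups separate on one axis
```

With more than two classes, the same scatter matrices are built over all classes and a general eigen-decomposition replaces the closed-form step.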

Support Vector Machine
Data mining models such as SVM have been among the most popular statistical methods used for e-nose data in recent years. SVM is an efficient learning system based on statistical learning theory and mathematical optimization that uses the principle of structural risk minimization to reach an optimal final solution. Various performance indicators such as sensitivity, specificity, and accuracy, as well as the desirability function, were used to compare classification systems.

Artificial Neural Network
An ANN is a pattern recognition algorithm that classifies samples by learning the similarities and differences among them. Eight neurons were used for the input layer based on the number of sensors (eight). The hidden layer was assigned the optimal number of neurons, and 10 output neurons were used according to the target output classes. From the total dataset, 60% were randomly selected for training, 20% for testing, and 20% for validation. During the training, the training data were presented to the network, and the network was adjusted based on its errors. The validation data were used to measure the network generalization, as well as to stop the training when generalization ceased to improve. Since the testing data did not affect the training, they provided an independent measurement of network performance during and after training. Network training was performed using the Levenberg-Marquardt algorithm, and the errors were estimated using the mean square error (MSE) [42].
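The 60/20/20 random partition described above can be sketched as follows. The function name, the fixed seed, and the use of 150 samples (the total reported for the LDA and QDA analysis) are assumptions made for illustration; the software actually used for the split is not stated.

```python
import random

def split_indices(n_samples, train_frac=0.6, test_frac=0.2, seed=0):
    """Randomly partition sample indices into training, testing, and
    validation subsets; validation receives whatever remains."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_frac)
    n_test = round(n_samples * test_frac)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation

train_idx, test_idx, val_idx = split_indices(150)
print(len(train_idx), len(test_idx), len(val_idx))  # 90 30 30
```

Keeping the three subsets disjoint is what lets the testing subset serve as an independent check on network performance.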

Statistical Model Evaluation Metrics
Model evaluation metrics were used to evaluate the performance of the algorithm in supervised learning. Common metrics, including specificity, recall, precision, sensitivity, accuracy, and F-score, were used to analyze the system performance. A confusion matrix uses true positive (TP), false positive (FP), true negative (TN), and false negative (FN) values to calculate metrics [43].
Specificity reflects the proportion of actual negative samples that were correctly identified as negative (true negative rate). Precision (P), also known as the positive predictive value, is defined as the ratio of TP samples to the total TP and FP samples. Recall (R), also known as sensitivity, is defined as the ratio of TP samples to the total TP and FN samples. Accuracy is the proportion of all samples that were classified correctly. The area under the curve (AUC) is the measure of the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve. The F-score considers both the recall and the precision, and all metrics were calculated from the confusion-matrix counts [44].
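For reference, the standard confusion-matrix definitions of these metrics are given below. These are the textbook forms and are assumed, not confirmed, to correspond to the equations cited from [44]; AUC is omitted because it is obtained from the ROC curve and not from a single set of counts.

```latex
\begin{align*}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision } (P) &= \frac{TP}{TP + FP} \\
\text{Recall } (R) &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
F\text{-score} &= \frac{2 \, P \, R}{P + R}
\end{align*}
```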

Electronic-Nose Output Responses
The sensor output signals, collectively assembled from all eight sensors in the 8-MOS experimental e-nose sensor array, constituted a full multisensory array output response, commonly referred to as an aroma signature pattern or smellprint. The unique aroma signature patterns, resulting from the e-nose analysis of the VOCs present in the headspace air samples derived from cucumber fruits harvested from plants receiving different kg/ha-equivalent application rates of urea fertilizer, are presented in Figure 2. All eight sensors produced measurable responses when exposed to aroma VOC emissions from all five of the cucumber fruit sample types. When examining the differences in the smellprint signatures produced from different treatments, the sensor intensity responses were compared using the relative position of the responses to the different sample types for each individual sensor. These comparisons are based on the highest intensity response reached by each sensor during the analytical runs. The relative order of magnitude of the sensor intensity responses to the majority of the urea application rate treatments (100, 200, and 400 kg/ha) followed the sensor intensity order (top to bottom, with the curve color indicated) of sensor 6 (MQ135, light green), sensor 7 (TGS2602, dark blue), sensor 1 (MQ3, medium blue), sensor 4 (MQ9, red), sensor 5 (TGS813, dark green), sensor 8 (TGS2620, cyan), sensor 3 (MQ136, gray), and sensor 2 (TGS822, orange). This predominant 6-7-1-4-5-8-3-2 relative-magnitude sensor intensity response pattern was significantly different from that of the no-urea (0 kg/ha) control treatment, which showed a different pattern of 6-7-1-5-4-8-3-2 in which two sensor groups (5-4-8 and 3-2) were much more closely clustered (in relative intensity) than in the sensor responses to the other urea treatments, except for 300 kg/ha.
The sensor relative intensity response pattern to the 300 kg/ha fertilizer rate was much like that of the 0 kg/ha rate, but sensor group 6-7-1 was more widely separated for the no urea (0 kg/ha) than for the 300-kg/ha treatment. Subtle differences in the sensor responses to the cucumber VOC sample types collectively contributed to the significant overall differences in the e-nose smellprint signatures.
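The ordering of sensors by their highest response, as used in the comparisons above, can be sketched as follows. The function name and the three short traces are made up for illustration (arbitrary units) and are not measurements from Figure 2.

```python
def rank_sensors_by_peak(responses):
    """Order sensor IDs from highest to lowest peak response.

    `responses` maps a sensor ID to its time series of output values.
    """
    peaks = {sensor: max(series) for sensor, series in responses.items()}
    return sorted(peaks, key=peaks.get, reverse=True)

# Hypothetical traces for three of the eight sensors.
example = {
    1: [0.1, 0.5, 0.8, 0.7],
    6: [0.1, 0.9, 1.4, 1.2],
    7: [0.1, 0.7, 1.1, 1.0],
}
order = rank_sensors_by_peak(example)
print("-".join(str(s) for s in order))  # 6-7-1
```

Applying the same ranking to each treatment yields pattern strings that can be compared directly, in the manner of the 6-7-1-4-5-8-3-2 and 6-7-1-5-4-8-3-2 patterns reported above.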

Linear and Quadratic Discriminant Analysis (LDA and QDA)
VOC emissions from the harvested cucumber fruits were classified using LDA and QDA based on the amount of urea fertilizer applied. Unscrambler software was initiated with data from eight sensors (for the cucumber sample, weighted as 1), while all the inputs were normalized. For LDA, the first two discriminant functions explained 92% of the total variance of the data and illustrated that the samples from the 10 cucumber harvest categories having different fertilizer levels were classified with relatively good accuracy, as indicated in Figure 3a. The 10 cucumber harvest categories were defined by, and corresponded to, ten separate cucumber fruit sub-harvests (one for each of the five urea treatments, 0-400 kg/ha equivalents) that occurred on two separate harvest dates, 4 months (first) and 5 months (second) after cucumber seed planting, respectively. Out of the 150 total data points, only 12 were not correctly classified using this method, and the model had a 92% correct detection rate. The QDA analysis indicated a higher accuracy of 98.67%, presented in Figure 3b. Only two samples were not properly classified.
Data clustering within the LDA and QDA two-dimensional aroma map plots provided the means for determining the urea application rates applied to plants from the e-nose VOC emissions data of cucumber fruit samples harvested from these plants. Thus, the LDA and QDA aroma map plots served as effective standardized curves for quantification, correlating the e-nose sensor array data (from fruit VOC emissions) to the urea fertilizer application rates, which resulted in different e-nose smellprint signatures for each urea application rate. Consequently, this information and the standard curve models were useful for determining unknown fertilizer application rates applied to plants (from which future cucumber fruit crops are harvested) by plotting new data from unknown fruit samples onto these standard curves. Comparisons of the LDA and QDA data plot models indicated that the LDA model yielded tighter (more precise) data clusters for the higher urea application rates (1400 and 2400 kg/ha) to cucumber plants, yet the QDA method provided a higher overall accuracy in correctly classifying cucumber fruits based on urea application rates.
A visual assessment of the data point (scatterplot) distribution patterns for the LDA and QDA models indicates very similar and comparable two-dimensional models that both provide effective and strong representations of the ten sub-harvests (a and b) for the five urea treatments (C0 to C4, indicating increasing rates of 0-400 kg/ha urea-applied fertilizer equivalents) from left to right on each corresponding graph (Figure 3a,b). Data clusters by treatment tended to become tighter as the urea fertilizer rate increased. The polynomial (QDA) model provided a better curve fit for the data than the linear (LDA) model with a higher correlation, indicating a better model for estimating the urea concentration levels (based on VOC emissions from cucumber fruits) from cucumber plants receiving unknown levels of urea fertilizer.
The results of the classification accuracy and confusion matrix for the LDA and QDA statistical classification methods for cucumber fruits, based on the rate of urea fertilizer applied, are presented in Table 2. For the confusion matrix, each data column presents the predicted class of each sample, whereas each row indicates the actual class of the cucumber fruit samples tested by the electronic nose. These results suggest that the QDA method was more accurate and effective than the LDA method, since its number of incorrectly classified samples was lower.
Table 2. Confusion matrix for the 2-group classification of 10 treatment groups of harvested cucumbers (based on the amount of urea fertilizer applied) using the LDA and QDA statistical methods.
The functional parameters of the LDA and QDA methods in the classification of cucumbers, based on the rates of urea fertilizer applied, are summarized in Table 3. The data classification accuracy using the LDA and QDA methods was 92.1% and 98.7%, respectively. By comparing the data in Tables 2 and 3, the QDA method was shown to have performed better than the LDA method for cucumber sample classification based on all the performance parameters, including accuracy, precision, recall, specificity, and AUC values. The F-scores for the QDA model were higher than for the LDA model for most of the urea fertilizer application rates.
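As a minimal sketch of how such performance parameters follow from a confusion matrix with actual classes in rows and predicted classes in columns, the code below computes one-vs-rest metrics for a single class. The 3-class matrix and the function names are hypothetical; the study's matrices have 10 classes and are given in the tables.

```python
def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k from a confusion matrix whose
    rows are actual classes and whose columns are predicted classes."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                         # class k, predicted as other
    fp = sum(cm[i][k] for i in range(n)) - tp    # other, predicted as class k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f_score": f_score}

def overall_accuracy(cm):
    """Fraction of all samples that lie on the matrix diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

# Hypothetical 3-class confusion matrix (15 samples per class).
cm = [[14, 1, 0],
      [2, 13, 0],
      [0, 0, 15]]
metrics_0 = per_class_metrics(cm, 0)
accuracy = overall_accuracy(cm)
```

The same accuracy arithmetic reproduces the rates reported above: 12 of 150 samples misclassified gives 138/150 = 92% for LDA, and 2 of 150 gives 148/150 ≈ 98.67% for QDA.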

Support Vector Machine
The support vector machine (SVM) model was used in this study to classify the samples based on the C-SVM and Nu-SVM methods, using kernel functions that included the linear, polynomial, and radial types. The C or Nu penalty coefficients and the γ kernel coefficient were searched over exponentially growing sequences with a grid search method. Then, the optimal combination of parameters was determined according to the separation achieved by the calculated hyperplane [45]. Among the total e-nose cucumber sample data collected, 70% were used for training, and 30% were used for testing. The input weighting for all the data inputs was one. The SVM results are summarized in Table 4.
1 Statistical analysis models and parameters used for data analysis: Nu-SVM = Nu Support Vector Machine classification, and C-SVM = C Support Vector Machine classification. Coefficient parameter symbols: c = C-SVM penalty coefficient; Nu = Nu-SVM penalty coefficient; γ = kernel coefficient.
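A grid search over exponentially growing parameter sequences can be sketched as below. The base of 2, the exponent ranges, and the function names are assumptions, since the actual search ranges are not given. The scoring function is only a placeholder; in practice it would train an SVM with the given parameters and return its validation accuracy.

```python
from itertools import product

def exponential_grid(base, low_exp, high_exp, step=1):
    """Exponentially growing sequence base**low_exp ... base**high_exp."""
    return [base ** e for e in range(low_exp, high_exp + 1, step)]

def grid_search(score_fn, c_values, gamma_values):
    """Evaluate every (C, gamma) pair and return the best-scoring one."""
    best_params, best_score = None, float("-inf")
    for c, gamma in product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score

def toy_score(c, gamma):
    """Placeholder objective (NOT a real accuracy); it simply peaks at
    C = 4 and gamma = 0.25 so the search has something to find."""
    return -((c - 4.0) ** 2) - (gamma - 0.25) ** 2

c_grid = exponential_grid(2.0, -2, 4)       # 0.25, 0.5, ..., 16
gamma_grid = exponential_grid(2.0, -4, 0)   # 0.0625, 0.125, ..., 1
best, _ = grid_search(toy_score, c_grid, gamma_grid)
print(best)  # (4.0, 0.25)
```

Exponential spacing covers several orders of magnitude with few candidates, which suits parameters whose useful values are not known in advance.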
The C-SVM method showed its highest accuracy for the training and validation of the 10 cucumber harvest groups, based on the rates of urea fertilizer applied, with a linear function, at 95.33% and 94% accuracy, respectively. For the same parameters, the Nu-SVM method showed its highest accuracy with a radial function, at 98% and 94.67%, respectively.
The results of the classification accuracy and confusion matrix are presented for the two methods that had the higher accuracy, C-SVM with a linear function model and Nu-SVM with a radial function model (Figure 4). The Nu-SVM method with the radial function model performed better than the C-SVM method with a linear function model, such that only three cucumber fruit samples were misclassified according to the urea application rate applied (Table 5).
Table 5. Confusion matrix for 10 groups of cucumbers based on the amount of urea fertilizer classification using the SVM statistical methods.
The functional parameters of C-SVM with the linear kernel function and Nu-SVM with the radial kernel function for the classification of cucumber fruits based on the rates of urea fertilizer applied are provided in Table 6. High values were observed for all five performance parameters and the F-score for all the urea fertilizer application rates, both for the Nu-SVM (radial function) model and the C-SVM (linear function) model.

Artificial Neural Network
The lowest cross-entropy (CE) error recorded during the training period was 0.526%. The training was stopped only when the output data from the electronic nose indicated a CE error of just less than 1%. The preliminary study results indicated that, if the CE error remained above 1% during training, the system would retrain and add more samples until it reached the required CE error. The confusion matrix in Table 7 shows how cucumbers were classified based on the amount of fertilizer applied. In the confusion matrix, only six samples were improperly classified, and the accuracy of the classification was 96.2%.
Table 7. Confusion matrix for the 10 cucumber treatment groups (based on the urea fertilizer applied) using the ANN classification methods.
1 Topology: A network topology notation (8-6-10) refers to the arrangement of a network with its nodes and connecting lines. The first number indicates the number of neurons in the input layer (based on data from eight sensors). The second number shows the number of neurons in the hidden layer (obtained by trial and error), and the third number indicates the number of neurons in the output layer (based on the ten cucumber harvest treatment groups). 2 Fruit harvests: 1 = first harvest 4 months after planting seeds and 2 = second harvest 5 months after planting seeds. 3 Ten cucumber harvest treatment groups based on two harvests for each of the five urea fertilizer application rates.
The functional parameters of the ANN for classifying cucumbers, based on the urea fertilizer applied, are presented in Table 8. Accordingly, the network with a structure of 8-6-10 had the highest accuracy for classifying cucumbers based on the fertilizer applied. All the performance parameters for the ANN, including accuracy, precision, recall, specificity, and AUC values, showed high levels of significance. The F-scores for the ANN model were very high for most of the urea fertilizer application rates, except for the 0 and 100-kg/ha urea fertilizer application rates for harvests 2 and 1, respectively.
Table 9 presents the statistical results of the ANN classification models using the electronic nose output. The total accuracy was 96.7%, and the error and cross-entropy values were 3.3% and 0.063, respectively. This model was not under- or overfitted, because the CE values for training were lower than the testing stage values, indicating its high performance. High levels of accuracy (>93%) were obtained during the training, validation, and testing stages of the ANN model e-nose data analysis and classification of the cucumber fruit samples.
Table 9. Statistical results of the ANN classification models using the e-nose sensor array outputs for 10 cucumber treatment group samples based on the urea fertilizer treatments.

[Table 9 column headings: Stage; Samples; Accuracy; Error; CE]
The receiver operating characteristic (ROC) curves in Figure 5 show that the classification of the 10 cucumber harvest groups had very high sensitivity. The ROC analyses yielded large areas under the curve (with high F values), a good indication of effective classification of the cucumber sample types, based on e-nose volatile emissions data, for the different application rates of urea fertilizer applied to cucumber plants.
Figure 6 summarizes the classification accuracy of the statistical models for the 10 cucumber harvest groups based on the different quantities of urea fertilizer applied for the five treatment (fertilizer rate) groups. The statistical results were calculated using Equations (1)-(6), and their means were reported. Among the models tested, the QDA model showed the best performance in classifying cucumber samples and provided the highest accuracy. The highest recall values for the 10 cucumber harvest groups with the QDA and Nu-SVM methods were 98.7% and 98%, respectively. Although precision and recall are valid metrics, one can be optimized at the expense of the other. Consequently, the F-score metric, which balances the two, was also used. The mean performance parameters of the QDA model (i.e., accuracy, precision, recall, specificity, AUC, and F-score) were 0.997, 0.987, 0.9410, 0.987, 0.987, 0.999, 0.993, and 0.987, respectively. The overall accuracy of all the models was high, and the electronic nose combined with the QDA method classified cucumber fruits with great accuracy into 10 harvest categories, composed of the five urea fertilizer treatment groups at each of the two harvests.
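Equations (1)-(6) are not reproduced in this section, so the following sketch assumes the standard one-vs-rest definitions of accuracy, precision, recall, specificity, and F-score computed from a multiclass confusion matrix. The three-class matrix and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                          # class k predicted as another class
    fp = sum(cm[i][k] for i in range(n)) - tp     # other classes predicted as k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
    }


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
cm = [
    [14, 1, 0],
    [1, 13, 1],
    [0, 0, 15],
]
metrics_class0 = per_class_metrics(cm, 0)
overall = overall_accuracy(cm)
```

Because the F-score is the harmonic mean of precision and recall, it is high only when both are high, which is why it is a useful single summary when the two trade off against each other.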

Figure 5. Receiver operating characteristic (ROC) curves for cucumber fruit classification. The analysis of 10 cucumber samples was based on the quantity of urea fertilizer applied using the electronic nose outputs as inputs.
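The area under an ROC curve can be computed directly from classifier scores without drawing the curve. The sketch below uses the rank-based definition (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one) in a one-vs-rest setting; whether the ROC curves in Figure 5 were constructed this way is an assumption, and the scores and labels are hypothetical.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for membership in group "A".
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = ["A", "A", "A", "B", "C", "B"]
auc_a = auc_one_vs_rest(scores, labels, positive="A")
```

An AUC of 1.0 means every sample of the target group scores above every sample of the other groups; 0.5 corresponds to chance-level ranking.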

Discussion
The effects of variable urea fertilizer application rates on the metabolic processes of cucumber plants were detected as measurable differences in the e-nose sensor array output responses and corresponding smellprint signatures. Comparisons of e-nose aroma profiles indicated differences in the physiological responses of plants receiving different amounts of urea-nitrogen fertilizer, which were reflected in the VOC profiles of the headspace volatiles of the cucumber fruits harvested from those plants. Furthermore, the aroma signature profiles derived from the 8-MOS e-nose sensor array outputs showed relative sensor response intensity patterns (from the analysis of VOCs of cucumber fruits) that were significantly different among the smellprint signatures of the unfertilized (no-urea) controls and all four urea-fertilized treatments (100-400 kg/ha equivalent). Additional statistical analyses using various classification models, based on e-nose data from fruits of plants receiving the five urea fertilizer treatments, provided further evidence of the differential effects of variable nitrogen fertilizer treatments on plant physiology and the concomitant effects on fruit volatile emissions and e-nose VOC profiles, as indicated in the following discussions.
The classification accuracy determined in the current study for discriminating between cucumber fruits within five urea treatment groups, based on differences in the VOC emissions, was 92% and 98.67% for the LDA and QDA statistical methods, respectively. The QDA method was more accurate than the LDA method, since it misclassified fewer samples. These results are consistent with those of previous research. Karami et al. [25] investigated the classification of edible oil shelf life over 150 days using the QDA and LDA methods, which had an accuracy of 95% and 94.4%, respectively. An electronic nose was used to detect fraudulent labeling of virgin olive oil in another study, with the results revealing a classification accuracy greater than 95% for the LDA and QDA [45]. Similarly, the LDA and QDA models used for classifying apples based on time in storage were reported to have an accuracy of 80.56% and 83.33%, respectively [46]. Khodamoradi et al. [47] reported 95% and 97.78% accuracy for the classification of 12 groups of basil plants based on the urea fertilizer applied using the LDA and QDA methods. This is the only known related nitrogen study, in which basil was classified with the electronic nose based on the amount of nitrogen fertilizer used. They used artificial neural networks, principal component analysis, linear discriminant analysis, and quadratic discriminant analysis methods to analyze the data.
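For reference, the standard textbook forms of the two discriminant rules compared above are sketched below. The notation is introduced here and is not taken from the study: x is the vector of sensor responses for one sample, and \mu_k, \Sigma_k, and \pi_k are the mean vector, covariance matrix, and prior probability of treatment group k.

```latex
% LDA: all groups share one pooled covariance matrix \Sigma
\delta_k^{\mathrm{LDA}}(x) = x^{\top}\Sigma^{-1}\mu_k
  - \tfrac{1}{2}\,\mu_k^{\top}\Sigma^{-1}\mu_k + \log \pi_k

% QDA: each group k keeps its own covariance matrix \Sigma_k
\delta_k^{\mathrm{QDA}}(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
  - \tfrac{1}{2}\,(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) + \log \pi_k

% Both rules assign x to the group with the largest score
\hat{y}(x) = \arg\max_k \, \delta_k(x)
```

Because QDA estimates a separate covariance matrix for each group, its decision boundaries are quadratic rather than linear. A higher QDA accuracy is what one would expect if the spread of sensor responses differs among treatment groups, although this section does not test that explanation directly.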
In the present study, the C-SVM method for classifying the 10 groups of cucumber harvests achieved its highest accuracy with a linear function, at 95.33% for training and 94% for validation. The Nu-SVM method achieved its highest accuracy with the radial basis function, at 98% and 94.67%, respectively. Karami et al. [25] reported that their C-SVM method was 100% accurate in determining the shelf life of edible oils using an electronic nose. Ghasemi-Varnamkhasti et al. [48] used response surface methodology (RSM) to describe the freshness of strawberries in polymer packages. Using the C-SVM method and polynomial function for training and validation, they reported 4% and 50.6% accuracy, respectively, whereas for the Nu-SVM method with the radial basis function, the training and validation accuracies were 85.2% and 55.6%, respectively. In addition, Gorji-Chakespari et al. [49] studied the classification of Damask rose essential oil and reported 99% accuracy. Khorramifar et al. [50] utilized similar machine learning methods to classify and identify potato cultivars based on a MOS e-nose sensor array. Karami et al. [21] used the SVM method with a linear basis function for training and validation and reported a classification accuracy of 98% and 97%, respectively.
Classifying cucumbers using the ANN method (with a topology structure of 8-6-10) resulted in the highest total accuracy (96.7%), with error and cross-entropy values of 3.3% and 0.063, respectively. The results of cucumber classification using artificial neural networks in this study were consistent with those of other researchers [22,27,37,51]. In a study on classifying edible essential oils, Rasekh et al. [52] reported that the accuracy of two-group and six-group classifications of essential oils by an ANN was 100% and 98.9%, respectively.

Conclusions
An experimental electronic nose device, consisting of eight commercial MOS gas sensors used in combination with four statistical models (LDA, QDA, SVM, and ANN) in this study, was capable of classifying cucumber fruits based on differences in the e-nose sensor array outputs resulting from different amounts of urea fertilizer applied to cultivated plants. The MOS e-nose sensor array data indicated that the differences in the VOC emissions from cucumber fruits resulted from different rates of urea fertilizer applied to cultivated plants. These differences in the e-nose output data suggest that different urea fertilizer application rates (as the only treatment variable) sufficiently affected the physiology of cucumber fruits, causing differences in the VOC emissions. The QDA method and the Nu-SVM method with a radial basis function (RBF) both showed a high accuracy for the classification of cucumber fruits derived from plants receiving different levels of urea fertilizer. These results suggest that the MOS e-nose sensor array output data, displayed as smellprint signatures and aroma data plot maps (from the QDA and LDA statistical models), were capable of indirectly detecting differences in the nitrogen fertilizer rates applied to plants. With similar research, this new e-nose tool could potentially be used to monitor urea fertilizer application rates applied to cucumbers and other agricultural crops and to facilitate adjustments of future urea application schedules to avoid overfertilization. Follow-up research is needed to correlate the specific VOC emission signatures with actual nitrate levels in fruits to achieve the maximum utility of this new method as an effective fertilizer application monitoring tool.
This capability could eventually help mitigate the negative human health effects associated with the frequent ingestion of fresh produce with high nitrate levels, particularly common vegetables like cucumbers, due to the high availability of these foods in commercial produce markets, short time periods between fertilization and harvesting, and multiple harvests per season.