Prediction of Almond Nut Yield and Its Greenhouse Gases Emission Using Different Methodologies

The evaluation of a production system to analyze greenhouse gases is one of the most interesting challenges for researchers. The aim of the present study is to model almond nut production based on inputs by employing artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) procedures. To predict the almond nut yield with respect to the energy inputs, several ANN and ANFIS models were developed, evaluated, and compared. Among the developed ANNs, a network with an 8-12-1 architecture and log-sigmoid and linear transfer functions in the hidden and output layers, respectively, was found to be the best model. In general, both approaches had a good capability for predicting the nut yield. The comparison revealed that the ANN procedure could predict the nut yield more precisely than the ANFIS models. Furthermore, greenhouse gas (GHG) emissions in almond orchards were determined, and the total GHG emission was estimated to be about 2348.85 kg CO2eq ha−1. Among the inputs, electricity had the largest contribution to GHG emissions, with a share of 72.32%.


Introduction
Today, production in modern agricultural systems is directly affected by energy inputs [1]. Every part of the industrial food system depends on fossil fuel consumption, from fertilizer production to the processing and transport of food products to market [2]. In general, oil and petroleum products comprise the main portion of the total energy consumption of the agricultural sector [3].
Energy production and consumption negatively impact the environment, mainly through the emission of greenhouse gases (GHGs) and air pollutants [4]. Therefore, as an imperative step towards sustainable agriculture and food production, energy sources must be used more efficiently [5]. A simple energy input-output evaluation is usually practiced to determine the efficiency of energy consumption in agricultural production systems. Numerous studies have performed this procedure on various crops, such as apples [6], grapes [7], cherries [8], rice [9], potatoes [10], and tangerines [11].
Farmers, insurers, and governments could effectively manage the risk of production by employing an accurate prediction method. Based on reviewing previous studies of ...

Table 1. Some studies reported for ANFIS application in the agricultural sector.

System/Process | Summary | Reference
Convective drying | ANFIS showed a good ability to predict the drying properties of three products (potato, garlic, and cantaloupe). | [29]
Wheat production | The grain yield of wheat was successfully predicted by the ANFIS approach. | [30]
Wheat production | Wheat grain yield was successfully predicted by ANFIS based on the energy inputs. | [31]
Broiler production | Broiler farms were analyzed to estimate energy outputs using the ANFIS method. | [22]
Broiler production | To estimate the best body weight and feed conversion ratio, ANFIS was evaluated for determination of the first three limiting amino acids. | [32]
Landslide susceptibility mapping | The findings of the ANFIS model were manifested using remote sensing data integrated with GIS for landslide susceptibility evaluation. | [33]
Landslide susceptibility mapping | Landslide susceptibility plotting was carried out using ANFIS optimized by the teaching-learning-based optimization and satin bowerbird optimizer algorithms. | [34]

Appl. Sci. 2022, 12, 2036

According to the preceding discussion, the main objectives of this work were to analyze the GHG emitted from almond production and to predict almond nut yield from energy inputs by means of AI modeling techniques, including ANN and ANFIS models.

Data Collection
The well-known almond variety Mamaei, in the Chaharmahal and Bakhtiari provinces, the main almond production regions in Iran, was selected for the present study (Figure 1). The Cochran formula (Equation (1)) [11] was employed to specify the required sample size (n), and data were gathered over three consecutive years from randomly selected gardeners through in-person interviews.
In Equation (1), N is the number of holdings in the target population; t is the reliability coefficient; S² is the variance of the studied attribute in the population; and d is the precision.
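Equation (1) itself is not reproduced in this text. As a minimal sketch, the function below assumes the finite-population form of the Cochran formula that is commonly paired with these four symbols, n = N·t²·S² / ((N − 1)·d² + t²·S²). The form of the equation and the numerical values in the usage lines are assumptions for illustration, not data from the study.

```python
import math


def cochran_sample_size(N, t, S2, d):
    """Sample size n for a finite population (assumed form of Equation (1)).

    N  : number of holdings in the target population
    t  : reliability coefficient (e.g. 1.96 for 95% confidence)
    S2 : variance of the studied attribute in the population
    d  : precision (acceptable error)
    """
    if N <= 1 or d <= 0:
        raise ValueError("N must exceed 1 and d must be positive")
    ts2 = (t ** 2) * S2
    return N * ts2 / ((N - 1) * d ** 2 + ts2)


# Hypothetical inputs, chosen only to show the call; not the study's values.
n_required = math.ceil(cochran_sample_size(N=1000, t=1.96, S2=0.25, d=0.05))
```

For very large N the result approaches t²·S²/d², the infinite-population sample size, which is a quick sanity check on any implementation.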
In brief, the cropping system and field practices in the studied region were commonly as follows:
• Trees were planted in a square pattern with 5-6 m between adjacent almond trees;
• The trees were pruned in two periods;
• Both chemical fertilizers and farmyard manure were used;
• Chemicals were sprayed three times per year;
• Electrical pumps were used to transport water from the local river to the orchards for drip irrigation.

Greenhouse Gas (GHG) Emissions
In the studied region, the interviews showed that human labor (x1), machinery (x2), fossil fuel (x3), chemical fertilizers (x4), livestock manure (x5), chemicals (x6), irrigation water (x7), and electrical energy (x8) were the main input parameters for almond production. To calculate the greenhouse gas (GHG) emissions, the relevant energy inputs were multiplied by their corresponding carbon emission equivalents as reported by Khoshnevisan et al. [23].
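The multiplication described above can be sketched as follows. The function name, the input names, and every quantity and emission factor in the example dictionaries are hypothetical placeholders; they are not the coefficients of Khoshnevisan et al. [23] or the survey data of this study.

```python
def ghg_emissions(inputs, factors):
    """Multiply each input quantity by its emission factor.

    inputs  : dict of input name -> quantity used per hectare
    factors : dict of input name -> kg CO2eq emitted per unit of that input
    Returns (per-input emissions, total emission, percentage shares).
    """
    emissions = {name: qty * factors[name] for name, qty in inputs.items()}
    total = sum(emissions.values())
    shares = {name: 100.0 * value / total for name, value in emissions.items()}
    return emissions, total, shares


# Placeholder values for illustration only (not from the study or from [23]).
example_inputs = {"electricity_kWh": 1000.0, "diesel_L": 50.0, "nitrogen_kg": 100.0}
example_factors = {"electricity_kWh": 0.6, "diesel_L": 2.7, "nitrogen_kg": 1.3}

emissions, total, shares = ghg_emissions(example_inputs, example_factors)
```

Returning the shares together with the absolute values mirrors the way the results are reported, as an amount in kg CO2eq ha−1 and a percentage per input.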

Development of ANN Models
In this study, the ANNs were implemented in MATLAB (R2013a), and the best ANN for predicting the almond nut yield was obtained through the following steps [35]:
• Based on a review of published papers and the reported results of ANN applications for various purposes, the multilayer perceptron (MLP) artificial neural network model was selected. Detailed information about MLPs can be found in Beigi et al. [35];
• The parameters described in Section 2.2 were employed as the network inputs, and the almond nut yield was considered as the output;
• After shuffling, the collected dataset was divided into three subsets for training (70%), validation (15%), and testing (15%) of the networks;
• Numerous neural network topologies were trained with different algorithms, including Levenberg-Marquardt, gradient descent (gd), gradient descent with momentum (gdm), and gradient descent with momentum and adaptive learning rate backpropagation (gdx);
• The logistic sigmoid (logsig) and tangent sigmoid (tansig) transfer functions were employed in the hidden layers, and the linear transfer function (purelin) was examined for the output layer [29];
• A trial-and-error method was used to find the most accurate ANN model;
• To test each network, different arrangements of the training set were used, some modifications were made to make the network more reliable, and finally, the performance of the system was assessed by subjecting the ANN to new input configurations.
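The steps above can be sketched in plain Python. The study itself used MATLAB, so this is only an illustration of the 70/15/15 split and of a forward pass through an 8-12-1 network with a log-sigmoid hidden layer and a linear output. All function names, the random initial weights, and the seed are assumptions, and training (for example, Levenberg-Marquardt) is deliberately left out.

```python
import math
import random


def split_dataset(samples, seed=0):
    """Shuffle, then split into 70% training, 15% validation, 15% testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(0.70 * len(shuffled))
    n_val = round(0.15 * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def logsig(v):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-v))


def init_mlp(n_in=8, n_hidden=12, seed=0):
    """Random, untrained weights for an n_in - n_hidden - 1 network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def mlp_forward(params, x):
    """Hidden layer uses logsig; the output layer is linear (purelin)."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["W1"], params["b1"])]
    return sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"]
```

A trial-and-error search over topologies would repeat `init_mlp` with different hidden-layer sizes, train each candidate on the training subset, and keep the one with the lowest validation error.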

Development of ANFIS
A typical rule set for the common first-order Takagi-Sugeno fuzzy model with two fuzzy if-then rules is presented as follows [22]:

Rule 1: If (x is $A_1$) and (y is $B_1$), then:
$$Z_1 = p_1 x + q_1 y + r_1 \tag{2}$$
Rule 2: If (x is $A_2$) and (y is $B_2$), then:
$$Z_2 = p_2 x + q_2 y + r_2 \tag{3}$$

where $p_1$, $p_2$, $q_1$, $q_2$, $r_1$, and $r_2$ are linear parameters, and $A_1$, $A_2$, $B_1$, and $B_2$ are non-linear parameters. A summarized architecture with two inputs and one output for the corresponding equivalent ANFIS is presented in Figure 2. Circles and squares in the figure indicate fixed and adaptive nodes, respectively.

In the first layer, each node creates the membership grades for its fuzzy set. The outputs are given as follows [18]:
$$O_{1,i} = \mu_{A_i}(x), \; i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(y), \; i = 3, 4 \tag{4}$$
In Equation (4), $\mu_{A_i}$ and $\mu_{B_{i-2}}$ are the degrees of the membership functions. By using the bell-shaped membership function, $\mu_{A_i}(x)$ is given by Equation (5) [33]:
$$\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}} \tag{5}$$
where $a_i$, $b_i$, and $c_i$ are adjustable parameters.

The fixed nodes of the second layer perform a simple multiplication. The outputs of the layer (the firing strengths of the rules) are calculated as follows (Equation (6)) [29]:
$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \; i = 1, 2 \tag{6}$$
In the third layer, the ratio of each rule's firing strength to the sum of all rules' firing strengths is calculated. The outputs of this normalization layer can be represented in the following form:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \; i = 1, 2 \tag{7}$$
The fourth layer is the defuzzification layer. The outputs of this layer are as follows:
$$O_{4,i} = \bar{w}_i Z_i = \bar{w}_i \left( p_i x + q_i y + r_i \right) \tag{8}$$
In Equation (8), the three modifiable parameters $\{p_i, q_i, r_i\}$ relate to the first-order polynomial.

In the fifth layer, the single fixed node, denoted by Σ, sums all incoming signals as follows [36]:
$$O_{5} = \sum_i \bar{w}_i Z_i = \frac{\sum_i w_i Z_i}{\sum_i w_i} \tag{9}$$

Since the present study had eight inputs (x1, x2, x3, x4, x5, x6, x7, and x8), two main schemes were developed in MATLAB version 7.14.0.739 (R2012a), and the results were compared to obtain the best ANFIS algorithm. In the first scheme, the input variables were divided into four groups, and each group was taken as the input of one ANFIS network (Figure 3). The outputs of ANFIS 1 and ANFIS 2 were then chosen as inputs for ANFIS 5. Similarly, the outputs of ANFIS 3 and ANFIS 4 were fed as inputs to ANFIS 6. Finally, the outputs of ANFIS 5 and ANFIS 6 were fed to ANFIS 7, which forecast the almond nut yield.

In the second ANFIS scheme, the inputs were split into three groups, and each group was chosen as the input for one of ANFIS networks 1 to 3 (Figure 4). Finally, ANFIS 4 used the output values of ANFIS 1-3 to predict the almond nut yield.
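The five layers described above can be traced in a minimal forward-pass sketch for the two-input, two-rule case. The function names, the data layout, and all parameter values in the usage lines are hypothetical; the sketch shows inference only, not the hybrid training of the premise and consequent parameters.

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership function with parameters a, b, c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_two_rule(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise    : {"A": [(a, b, c), (a, b, c)], "B": [(a, b, c), (a, b, c)]}
    consequent : [(p1, q1, r1), (p2, q2, r2)]
    """
    # Layer 1: membership grades of x in A1, A2 and of y in B1, B2.
    mu_a = [bell_mf(x, *premise["A"][i]) for i in range(2)]
    mu_b = [bell_mf(y, *premise["B"][i]) for i in range(2)]
    # Layer 2: firing strength of each rule (product of its grades).
    w = [mu_a[i] * mu_b[i] for i in range(2)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: first-order rule outputs Z_i = p_i*x + q_i*y + r_i.
    z = [p * x + q * y + r for (p, q, r) in consequent]
    # Layer 5: weighted sum of the rule outputs.
    return sum(wb * zi for wb, zi in zip(w_bar, z))


# Hypothetical parameters chosen so that both rules fire equally at (1, 1).
premise = {"A": [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)],
           "B": [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, 2.0, 1.0)]
prediction = anfis_two_rule(1.0, 1.0, premise, consequent)
```

With these placeholder parameters, each membership grade at (1, 1) is 0.5, both normalized firing strengths are 0.5, and the output is the average of the two rule outputs (2 and 5). In the hierarchical schemes, the output of one such sub-model would simply be passed as an input to the next.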

Performance Evaluation and Error Analysis of ANN and ANFIS Models
The performance of the proposed networks was evaluated using three criteria: mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE).
In these criteria, Pi and Ai represent the predicted and actual values for the ith farmer, and n is the number of points in the given input data.
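The defining equations of the three criteria are not reproduced in this text. The sketch below assumes their standard definitions in terms of Pi, Ai, and n. Whether the study reports MAPE as a fraction or multiplied by 100 is not stated here, so the fraction form used below is an assumption.

```python
import math


def mse(actual, predicted):
    """Mean square error: mean of (P_i - A_i)^2."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; multiply by 100 for %.

    Assumes no actual value is zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

Because RMSE is the square root of MSE, the two always rank candidate models in the same order; MAPE adds a scale-free view of the error.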

Greenhouse Gas Emissions
The average amount and share of GHG emissions from the different inputs in almond production are presented in Table 2.

Table 2. Amounts and shares (%) of equivalent greenhouse gas emissions of inputs in almond production.

Inputs | GHG Emission (kg CO2eq ha−1) | Percentage (%)

As shown, the overall GHG emissions were 2348.85 kg CO2eq ha−1, which is very close to the values reported by other studies for crop production. The GHG emissions for wheat production were reported to be between 410 and 1130 kg CO2eq ha−1, depending on fertilizer rate, location, and seeding system [37]. The total emissions for potato production were calculated at 2350 kg CO2eq ha−1 by Ferreira et al. [38], while Pishgar-Komleh et al. [39] calculated 992.88 kg CO2eq ha−1. The total GHG emissions for wheat production were reported to be about 1038 kg CO2eq ha−1 [40]. Taghavifar and Mardani [28] estimated the GHG emissions for apple production to be 1195.79 kg CO2eq ha−1.
Furthermore, the major part (72.32%) of the GHG emissions in the production of the Mamaei almond nut belonged to electricity, followed by chemical fertilizers (8.58%) and chemicals (7.54%).
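As a small worked example, the reported shares can be converted back into absolute emissions using the reported total. Only the total and the three percentages come from the text above; the variable names are illustrative, and the values for the remaining inputs are not derived here.

```python
TOTAL_GHG = 2348.85  # kg CO2eq per ha, as reported above

# Percentage shares reported above for the three largest contributors.
SHARES = {"electricity": 72.32, "chemical fertilizers": 8.58, "chemicals": 7.54}

# Absolute emission of each input = total x share / 100.
absolute = {name: TOTAL_GHG * share / 100.0 for name, share in SHARES.items()}

# Share left for all other inputs combined.
remaining_share = 100.0 - sum(SHARES.values())

for name, value in absolute.items():
    print(f"{name}: {value:.2f} kg CO2eq/ha")
print(f"all other inputs: {remaining_share:.2f}%")
```

Under these figures, electricity alone accounts for roughly 1699 kg CO2eq ha−1, and all inputs other than the three listed together contribute about 11.56% of the total.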

Evaluation of ANN Models
To obtain a model for predicting the almond nut yield based on input energies, several ANN algorithms with various topologies and learning structures were evaluated. Various numbers of hidden layers and of neurons in each layer, as well as different transfer functions, were used to build the models. Table 3 lists the performance of the studied ANN networks. As highlighted in the table, the 8-12-1 topology was found to be the best network structure, with the log-sigmoid and linear transfer functions employed in the hidden and output layers, respectively. The calculated MSE, RMSE, and MAPE for the best algorithm were 0.186, 0.431, and 0.041, respectively. The correlation between the measured data and the values predicted by the best ANN model is shown in Figure 5.

Figure 5. Comparison between measured and predicted values of the almond nut yields for the training, validation, and testing of the optimal ANN algorithm.

Khoshroo et al. [1] developed various multilayer ANN models to predict grape yield and found the 7-6-1 architecture to be the best model.
Khoshnevisan et al. [23] trained several artificial neural networks with different topologies and learning algorithms to predict the amount of energy consumed for tomato production in the greenhouse. The researchers employed several types of activation functions (i.e., logistic sigmoid, tangent sigmoid, and purelin) as well as various numbers of hidden layers and of neurons in every hidden layer. They reported that the best prediction was obtained by the 10-20-7-9-1 network topology, with the tangent sigmoid and purelin transfer functions employed in the hidden layers and the target layer, respectively. Furthermore, among the different training algorithms, the Levenberg-Marquardt (LM) algorithm produced the best result. For sugarcane production in planted or ratoon farms, Kaab et al. [41] used ANNs to predict life cycle assessment and output energy, and they found the best ANN models to have 9-10-5-11 and 7-9-6-11 topologies, respectively. Furthermore, in training for environmental impacts and output energy, the researchers reported R² values in the range of 0.923-0.986 in planted farms and 0.942-0.982 in ratoon farms. Based on data from a time series (1961-2016), Abraham et al. [42] used the ANN method to estimate soybean harvest parameters such as the area, yield, and production in Brazil, and compared the results with classical methods of time series analysis. They stated that, for harvest area and production, ANN was the best approach, while a classical linear function was more effective for yield prediction. Adisa et al. [43] employed an ANN approach to predict maize production in South Africa based on climate variable inputs, including precipitation, maximum and minimum temperatures, potential evapotranspiration, soil moisture, and cultivated land.

Evaluation of ANFIS Models
The two main ANFIS architectures were developed to assess the effect of ANFIS topology on predicting the almond nut yield based on energy inputs. To attain the best result, the following key settings were varied: I: type of input MFs (triangular, trapezoidal, bell, Gaussian, and sigmoid); II: type of output MFs (fixed or linear); III: number of input and output MFs, and the optimization method (hybrid or backpropagation); IV: number of epochs. The optimal findings for the first ANFIS model are shown in Table 4. According to the results, the Gaussian MF, along with the hybrid learning method, yielded the best result. The hybrid learning method combines the least-squares (LS) and backpropagation (BP) algorithms: LS estimates the parameters associated with the output MFs, and BP tunes the parameters associated with the input MFs [44]. The number of epochs for training the model was set to 40, since additional epochs led to only a very small variation in error. Furthermore, it has been stated that the hybrid optimization method gives better results than the backpropagation learning algorithm.
The total number of parameters in the network is determined by the number of MFs for the input factors. The information from ANFIS for the first model is shown in Table 5. The overall number of training data sets was 119, and the overall number of parameters for ANFIS 1-ANFIS 7 was 28, indicating that the number of MFs for the inputs was chosen suitably. The MSE, RMSE, and MAPE for the final ANFIS network were 0.295, 0.543, and 0.055, respectively. A comparison of the three steps shows that the error statistics of the second step (ANFIS 5 and 6) were higher than those of the first step and lower than those of ANFIS 7.

The best outputs of the second ANFIS topology are shown in Table 6. According to the table, Gaussian input MFs combined with linear output MFs, along with the hybrid learning technique, resulted in the best prediction. The characteristics of the best prediction for the second ANFIS model are given in Table 7. As shown in Figure 4, three input parameters were entered into each of ANFIS 1, 2, and 4. The number of MFs was set to 2, 2, 2 for ANFIS 1 and 2, and to 3, 3, 3 for ANFIS 4, while for ANFIS 3 it was set to 2, 2. Correspondingly, the total number of parameters was 44 for each of ANFIS 1 and 2, 126 for ANFIS 4, and 20 for ANFIS 3. The MSE, RMSE, and MAPE for ANFIS 4 were 0.290, 0.538, and 0.048, respectively, which means that the second ANFIS model can estimate output energy with high accuracy.

In a case study, Naderloo et al. [30] employed ANFIS to predict the grain yield of wheat in Iran. Because there were eight inputs, they clustered the input vector for ANFIS into two groups and trained two networks. Diesel fuel, fertilizer, and electricity energies were employed as the input variables for ANFIS 1, and human labor, machinery, chemicals, water for irrigation, and seed energies were considered for ANFIS 2. They found that the RMSE and R² were 0.013 and 0.996 for ANFIS 1, and 0.018 and 0.992 for ANFIS 2, respectively. Finally, they used the predicted values of the two networks as the inputs to a third ANFIS and found that the RMSE and R² values for ANFIS 3 were 0.013 and 0.996, respectively. In a study conducted by Khoshnevisan et al. [23] to model energy consumption in tomato production, the researchers reported that the combination of Gbell and linear MFs with a hybrid learning technique resulted in the best prediction.
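The parameter totals quoted for the second ANFIS model (44, 126, and 20) can be checked with a short count. The sketch assumes a grid-partitioned Sugeno system in which each Gaussian input MF has two parameters, the number of rules is the product of the MF counts, and each rule has a linear consequent with one coefficient per input plus a constant. These are assumptions about how the totals arise; the function name is illustrative.

```python
def anfis_parameter_count(mfs_per_input, params_per_mf=2, linear_output=True):
    """Count premise plus consequent parameters of a grid-partitioned ANFIS.

    mfs_per_input : sequence with the number of MFs on each input
    params_per_mf : parameters per input MF (2 for a Gaussian MF)
    linear_output : True for first-order (linear) consequents,
                    False for constant consequents
    """
    n_inputs = len(mfs_per_input)
    premise = params_per_mf * sum(mfs_per_input)
    n_rules = 1
    for n_mf in mfs_per_input:
        n_rules *= n_mf
    per_rule = n_inputs + 1 if linear_output else 1
    return premise + n_rules * per_rule


count_three_inputs_two_mfs = anfis_parameter_count([2, 2, 2])    # ANFIS 1 and 2
count_three_inputs_three_mfs = anfis_parameter_count([3, 3, 3])  # ANFIS 4
count_two_inputs_two_mfs = anfis_parameter_count([2, 2])         # ANFIS 3
```

Under these assumptions the count reproduces the reported totals: 12 + 32 = 44, 18 + 108 = 126, and 8 + 12 = 20. The rule count grows multiplicatively with the number of inputs, which is one reason the eight inputs were split across several small sub-models.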

Comparison among ANN and ANFIS Approaches
In general, ANFIS models are able to work with uncertain, noisy, and imprecise data, particularly data related to agricultural production processes, because they combine ANN and fuzzy system models. As presented, both the ANFIS and ANN models had good accuracy in predicting the nut yield. However, comparing the results revealed that the ANN estimated the almond nut yield more accurately than the ANFIS models (MSE of 0.186 for the best ANN versus 0.295 and 0.290 for the first and second ANFIS schemes). These results are contrary to the findings reported by Khashei-Siuki et al. [45] for wheat yield prediction and by Khoshnevisan et al. [23] for tomato production in the greenhouse.

Conclusions
Experimental and modeling investigations into input-output energy patterns in almond orchards in Chaharmahal and Bakhtiari provinces, Iran, were conducted. The total GHG emissions were about 2348.85 kg CO2eq ha−1, and electricity had the key role, followed by chemical fertilizers and chemicals. According to the results obtained in the present study, it could be concluded that using renewable energy resources such as solar and/or wind power generators can help farmers improve energy use efficiency, sustainability, and their production, as well as reduce GHG emissions. To estimate the almond nut yield with respect to the energy inputs, several ANN and ANFIS models were developed, evaluated, and compared. In general, both approaches had a good capability in predicting the nut yield. Furthermore, the ANN models forecast the nut yield more accurately than the ANFIS models.