Predicting Parameters of Heat Transfer in a Shell and Tube Heat Exchanger Using Aluminum Oxide Nanofluid with Artificial Neural Network (ANN) and Self-Organizing Map (SOM)

This study presents a perceptron artificial neural network model with three inputs to predict the Nusselt number and energy consumption in the processing of tomato paste in a shell-and-tube heat exchanger with aluminum oxide nanofluid. The Reynolds number in the range of 150–350, temperature in the range of 70–90 °C, and nanoparticle concentration in the range of 2–4% were selected as network input variables, while the corresponding Nusselt number and energy consumption were considered as the network targets. The network has 3 inputs, 1 hidden layer with 22 neurons and an output layer. The SOM neural network was also used to determine the number of winner neurons. The developed optimal artificial neural network model shows reasonable agreement in predicting the experimental data, with mean square errors of 0.0023357 and 0.00011465 and correlation coefficients of 0.9994 and 0.9993 for the Nusselt number and energy consumption data sets, respectively. The obtained values of the maximum error e_MAX for the Nusselt number and energy consumption are 0.1114 and 0.02, respectively. The desirable results obtained for the two criteria of correlation coefficient and mean square error indicate successful prediction by an artificial neural network with a topology of 3-22-2.


Introduction
Adding nanoparticles to a base fluid affects the thermophysical characteristics of the fluid [1][2][3][4][5]. Several studies have been conducted on the effects of adding nanoparticles on the heat transfer of nanofluids [6][7][8][9][10]. Wanatasanapan et al. [11] investigated the influence of the TiO2-Al2O3 nanoparticle mixing ratio on the thermal conductivity, rheological properties and dynamic viscosity of a water-based hybrid nanofluid. Li et al. [12] estimated the stability and thermal performance of Al2O3-ethylene glycol (EG) nanofluids under ultrasonic conditions. Their results showed that Al2O3-EG nanofluids obtained by ultrasonication for 60 min exhibited the most encouraging properties. Sekhar et al. [13] prepared a cobalt oxide-water nanofluid and studied its thermal and physical properties. Based on their results, relative viscosity values decreased with temperature and increased with the volume fraction of nanoparticles, while the thermal conductivity ratio of the nanofluid increased as well. Gu et al. [14] evaluated the thermal conductivity and viscosity of a water-based nanofluid containing carbon nanotubes decorated with Ag nanoparticles. Based on their findings, the thermal conductivity (k) of the nanofluid increased with the thermal filler loading and the decoration quantity of Ag nanoparticles. Bahmani et al. [15] investigated heat transfer and turbulent flow of a water/alumina nanofluid in a double pipe heat exchanger. The results indicated that increasing the nanoparticle volume fraction or Reynolds number enhanced the Nusselt number and the convection heat transfer coefficient. Maddah et al. [16] presented a factorial experimental design for the thermal performance of a double pipe heat exchanger using an Al2O3-TiO2 hybrid nanofluid. They found that using nanocomposites and twisted tapes increased the exergy efficiency compared to utilizing conventional water as the heat transfer fluid.
Goodarzi et al. [17] studied heat transfer and pressure drop of a counter-flow corrugated plate heat exchanger using MWCNT-based nanofluids. It was shown that the performance of the plate heat exchanger can be enhanced using MWCNT/water as the working fluid. Fares et al. [18] analyzed heat transfer in a shell and tube heat exchanger operating with graphene nanofluids. The results indicated that using graphene/water nanofluids enhanced the thermal performance of the shell and tube heat exchanger. Cox et al. [19] used nanofluids in a shell-and-tube heat exchanger. They found both augmentation and deterioration of the heat transfer coefficient for nanofluids, depending on the flow rate through the heat exchangers. In a study by Shahsavar et al. [20], the impact of variable fluid properties on forced convection of a Fe3O4/CNT/water hybrid nanofluid in a double-pipe mini-channel heat exchanger was assessed. The results showed that the non-Newtonian hybrid nanofluid always had a higher heat transfer rate, overall heat transfer coefficient and effectiveness than the Newtonian hybrid nanofluid. Maddah et al. [21] investigated the viscosity and thermal conductivity of hybrid Cu/CNT water-based nanofluids at various concentrations and temperatures. The results demonstrated that, although increased concentration enhanced both the thermal conduction coefficient and the viscosity, the increase in temperature followed the expected trends of increasing thermal conductivity and decreasing viscosity. Ghasemi et al. [22] predicted and optimized the exergetic efficiency of a TiO2-Al2O3/water nanofluid at different Reynolds numbers, volume fractions and twist ratios using artificial neural networks and experimental data. The findings indicated successful prediction by the network.
Kahani et al. [23] developed multilayer perceptron artificial neural network (MLP-ANN) and least square support vector machine (LSSVM) models to predict the Nusselt number and pressure drop of TiO2/water nanofluid flows through non-straight pathways. Based on the output results of the developed models, the MLP-ANN model was able to predict both the Nusselt number and the pressure drop of the nanofluid flow more precisely than the LSSVM model. The present study is aimed at predicting the Nusselt number and energy consumption using an artificial neural network in the processing and preparation of tomato paste, based on the laboratory study by Jafari et al. [24]. Ahmadi et al. [25] used an artificial neural network to predict the thermo-physical properties of TiO2-Al2O3/water nanoparticles. Their results showed that the SOM and BP-LM algorithms can be considered excellent tools for predicting thermal conductivity. Zahir Shah et al. [26] investigated the impact of nanoparticle shape and radiation on the behavior of an Al2O3/H2O nanofluid under Lorentz forces. Based on their results, convective thermal energy transportation is enhanced by augmenting buoyancy forces, the radiation parameter and the nanoparticle shape factor.
Esfe [27] designed a neural network for predicting the thermal conductivity of a ZnO-MWCNT/EG-water hybrid nanofluid for engineering applications. It was found that the neural network is able to predict the data, and the data regression coefficients indicated the high accuracy of the applied method. Zahir Shah et al. [28] studied heat transfer intensification of a nanomaterial with the involvement of a swirl flow device, with regard to entropy generation. The characteristics of thermal energy transfer of the hybrid nanofluid were investigated by varying the pitch ratio (P) of the helical turbulator and the Reynolds number (Re) of the fluid. The obtained results indicated that making the fluid more turbulent by increasing Re decreases the temperature of the fluid while increasing the fluid velocity. Numerical modeling of hybrid nanofluid (Fe3O4+MWCNT/H2O) migration over a porous cylinder considering the MHD effect was examined by Zahir Shah et al. [29]. They concluded that enhancing the medium porosity, buoyancy forces and radiation parameter increased the free convective thermal energy flow. Zahir Shah et al. [30] formulated the fractional dynamics of HIV with a source term for the supply of new CD4+ T-cells depending on the viral load, via the Caputo-Fabrizio derivative; their numerical study focused on the path-tracking damped oscillatory behavior of a model for the HIV infection of CD4+ T-cells. Other studies are shown in Table 1. In the present work, first, using a self-organizing map (SOM) neural network, the winner neuron is identified; this winner neuron is then used in the perceptron artificial neural network to predict the data. The input data include temperature, nanofluid concentration and Reynolds number, and the output data are the Nusselt number and energy consumption.

Experimental Data
The experimental results obtained by Jafari et al. [24], who studied aluminum oxide nanoparticles in a shell and tube heat exchanger, were used in the temperature range of 70–90 °C, Reynolds number range of 150–350, and nanoparticle concentration range of 2–4%. Figure 1 represents the Nusselt number of the shell and tube heat exchanger with the Al2O3-water nanofluid. Red, green and blue colours represent the Nusselt number changes in terms of Reynolds number at concentrations of 0, 2, and 4%, respectively. Figure 2 shows energy consumption and processing time for nanofluids and water in the thermal processing of tomato juice. Red and green colours represent energy and time, respectively.

Multilayer Perceptron (MLP) Neural Network
One of the best artificial neural networks for solving complex and nonlinear problems is the MLP network with supervised training and the error back-propagation algorithm [37]. In general, MLP networks have three layers, input, hidden and output, each layer having a number of processing units called neurons. Each neuron receives a weighted sum of the outputs of the previous layer and passes it through an activation or threshold function. These functions can be of different types, such as sigmoid, Gaussian, linear and binary. The training basis of these networks is to change the weights of the connections in order to achieve the desired output. To this end, first a sample is presented to the network and its output is calculated. By comparing this output with the actual value, the error is calculated and used to correct the network weights. The error is then propagated back into the network and the weights are updated. This cycle continues until the sum of squared errors is minimized [37]. A network is considered to generalize well when the prediction error on data not seen during training (experimental test data) is acceptable. Therefore, in modeling with these networks, the data should be divided from the beginning into two categories: training data and test data. The training set should cover the entire data space as much as possible.
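As an illustration of this training cycle (not the authors' MATLAB implementation), the following NumPy sketch trains a 3-22-2 MLP with a sigmoid hidden layer and a linear output layer by plain gradient-descent back-propagation; the synthetic data, weight initialization and learning rate are all assumptions for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 75 samples with 3 inputs and 2 targets
# (shapes mirror the 3-input / 2-output setup; values are synthetic)
X = rng.uniform(-1.0, 1.0, (75, 3))
Y = np.column_stack([X[:, 0] * X[:, 1], np.sin(X[:, 2])])

# 3-22-2 topology: sigmoid hidden layer, linear (Purelin-style) output
W1 = rng.normal(0.0, 0.5, (3, 22)); b1 = np.zeros(22)
W2 = rng.normal(0.0, 0.5, (22, 2)); b2 = np.zeros(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    H = sigmoid(X @ W1 + b1)      # hidden-layer activations
    return H, H @ W2 + b2         # linear output layer

lr = 0.1
mse_before = np.mean((forward(X)[1] - Y) ** 2)
for _ in range(2000):             # repeated forward/backward cycles
    H, out = forward(X)
    d_out = 2.0 * (out - Y) / len(X)          # error at the output layer
    gW2, gb2 = H.T @ d_out, d_out.sum(0)
    dH = (d_out @ W2.T) * H * (1.0 - H)       # error propagated backward
    gW1, gb1 = X.T @ dH, dH.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1            # weight updates
    W2 -= lr * gW2; b2 -= lr * gb2
mse_after = np.mean((forward(X)[1] - Y) ** 2)
```

The loop mirrors the forward path (output calculation with fixed parameters) and the backward path (error redistribution and weight update) described above.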

Radial Base Function Model
The design of this network is faster than that of the MLP network, and it also learns faster. These networks perform well when many input vectors are available and can approximate any reasonable function given a sufficient number of neurons. The network consists of three layers: an input layer, a hidden layer with a nonlinear radial basis activation function, and a linear output layer. The most important property of RBF networks is that each hidden-layer neuron responds only to inputs near the center of its basis function; consequently, a neuron produces a significant non-zero response only when the input falls within its local region, and otherwise its output is small.
The output of the kth network unit is

y_k(X) = Σ_{j=1..M} w_kj φ_j(X), (1)

where M is the number of basis functions, X is the input data vector, w_kj represents the weight connecting basis function j to output unit k, and φ_j is the nonlinear function of unit j, which is typically a Gaussian of the form

φ_j(X) = exp(−‖X − c_j‖² / (2σ_j²)), (2)

where c_j and σ_j denote the center and width of the jth basis function.
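A minimal numerical sketch of this idea (illustrative only; the data, centers and width σ are assumptions): the Gaussian basis responses are evaluated for each input, and the linear output weights are then fitted by least squares:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 60 samples with 2 inputs and a smooth target
X = rng.uniform(0.0, 1.0, (60, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1]

centers = X[::6]     # M = 10 basis centers taken from the data (an assumption)
sigma = 0.5          # common basis width (an assumption)

def basis(X):
    """Gaussian responses φ_j(X) = exp(−‖X − c_j‖² / (2σ²)) for all centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

Phi = basis(X)                                     # (60, 10) design matrix
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)        # linear output weights w_kj
pred = Phi @ w                                     # network output
mse = np.mean((pred - y) ** 2)
```

Because the output layer is linear, the weights can be obtained in one least-squares solve, which is why RBF networks are often faster to design than MLPs.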

Self-Organizing Map (SOM) Neural Network
A self-organizing map (SOM) is an artificial intelligence method first developed by Kohonen [38]. This model projects a regular distribution of high-dimensional data onto a small system; hence, it can reduce complex nonlinear relations in large data sets to a simple display. While the original topological structure of the data is maintained, SOMs reduce the dimensionality of the data and reveal similar patterns. In this work, the method is implemented in MATLAB software. Each SOM network usually consists of an input layer and an output layer (called the competitive map or layer), connected by weight vectors (synapses). The SOM is trained in an iterative process: each input vector, based on maximum similarity, activates a node called the winner cell in the output layer. The similarity between two vectors is usually measured by the Euclidean distance according to Equation (3) [38]:

D_j = Σ_i (x_i − w_ij)², (3)

where x_i is the ith component of the input vector, w_ij is the weight connecting input i to output neuron j, and D_j is the squared Euclidean distance between the input sample and the weight vector connecting to the jth output cell, called the map unit.
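The winner-cell search and the competitive weight update can be sketched as follows; this toy NumPy example (not the MATLAB implementation) uses a 28-unit map with 3 inputs, matching the map size used later, while the random data and learning-rate schedule are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in inputs: 75 samples of (temperature, concentration, Reynolds),
# already scaled to [0, 1]; values are synthetic
X = rng.uniform(0.0, 1.0, (75, 3))

# 28 map units with one weight vector each (3 weights per unit)
W = rng.uniform(0.0, 1.0, (28, 3))

def winner(x):
    """Index of the unit minimizing the squared Euclidean distance D_j."""
    return int(np.argmin(((W - x) ** 2).sum(axis=1)))

# Simple competitive training: pull the winner cell toward each input sample
for epoch in range(20):
    lr = 0.5 * (1.0 - epoch / 20.0)   # decaying learning rate
    for x in X:
        j = winner(x)
        W[j] += lr * (x - W[j])

# Count how many samples each unit wins; the most-populated unit is the winner neuron
counts = np.bincount([winner(x) for x in X], minlength=28)
winner_unit = int(counts.argmax())
```

The unit with the largest member count plays the role of the winner neuron that is later carried over to the perceptron network.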

Back Propagation Learning Algorithm
This algorithm is a supervised learning algorithm that basically consists of two main paths. Forward path: the input vector is applied to the neural network and its effect propagates through the middle layers to the output layer. In this path, for each input, a value called the output is calculated by the network, while the network parameters remain constant.
Backward path: after the output is generated in the forward path, the difference between the desired (observed) output and the output calculated by the network is determined. Error signals are redistributed on the backward path from the output layer throughout the network, and the network parameters are updated.
The above dual process is repeated many times to bring the network output closer to the desired output, and training stops when the error falls below the permissible threshold. Among error back-propagation methods, the Levenberg-Marquardt algorithm was selected for the present study because of its faster convergence in training medium-sized networks. The error back-propagation algorithm changes the network weights and bias values in such a way that the performance function decreases most rapidly.

Momentum Algorithm
In this algorithm, the weight-update rule is defined so that the weight change in the nth iteration depends partly on the size of the weight change in the previous iteration: Δw(n) = −η ∇E(w) + α Δw(n−1), where η is the learning rate, α is the momentum coefficient and ∇E(w) is the gradient of the error function.
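A minimal sketch of the momentum update on a toy quadratic error surface (the error function, learning rate and momentum coefficient here are illustrative assumptions, not values from the study):

```python
import numpy as np

# Minimize the toy error E(w) = Σ w_i² with gradient descent plus momentum
w = np.array([4.0, -3.0])      # initial weights
dw = np.zeros_like(w)          # previous weight change, Δw(n−1)
eta, alpha = 0.1, 0.9          # learning rate η and momentum coefficient α

def grad(w):
    return 2.0 * w             # ∇E(w) for E(w) = Σ w_i²

for _ in range(300):
    dw = -eta * grad(w) + alpha * dw   # Δw(n) = −η∇E(w) + αΔw(n−1)
    w = w + dw
```

The momentum term αΔw(n−1) smooths the trajectory and accelerates convergence along directions of persistent gradient sign.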

Modeling of Artificial Neural Network
The number of input data points fed to the artificial neural network was 225, covering the Reynolds number, nanoparticle concentration and temperature. The 75 corresponding values of Nusselt number and energy consumption were considered as the objective function. The neural network is designed using a multilayer perceptron algorithm, and the general modeling process for predicting the Nusselt number and energy consumption is represented in Figure 3. As seen, this network has three inputs, a hidden layer with 22 neurons and two outputs. The hidden-layer transfer function is sigmoid, and the Purelin (linear) transfer function is used in the output layer. The parameters of temperature, Reynolds number and nanofluid concentration were considered as inputs to the SOM neural network in order to investigate and select the winner neuron with the most data; as shown in Figure 4, a network with 3 inputs and 28 neurons is evaluated. In this study, a two-layered feedforward neural network using the error back-propagation Levenberg-Marquardt (BP-LM) and momentum algorithms was employed to model the Nusselt number and energy consumption.
Mean square error (MSE) and the correlation coefficient (R) are used to evaluate the results [32]:

MSE = (1/N) Σ_{i=1..N} (U_i^Exp − U_i^ANN)², (4)

R = sqrt(1 − Σ_{i=1..N} (U_i^Exp − U_i^ANN)² / Σ_{i=1..N} (U_i^Exp − Ū)²), (5)

where N is the number of experimental data points, U_i^Exp is the experimental value of the Nusselt number or energy consumption, and U_i^ANN is the Nusselt number or energy consumption predicted by the neural network. Moreover, Ū is the mean value of the Nusselt number or energy consumption, and e denotes the prediction error, whose maximum absolute value is reported as e_MAX.

Figure 5 represents the output of the SOM artificial neural network with the primary neurons and the number of their members. Yellow colour represents a successful neuron, which assigns 15 data points to itself, and blue colour represents unsuccessful neurons. Each neuron contains a number of data points, and the neuron with the most data is the winner neuron [25]. In Figure 5, it can be seen that the winner neuron is neuron 22, with 22 data points; this neuron can be used in the perceptron neural network. The distance between the center of each neuron and its neighboring neurons is shown in Figure 6. The longer the distance between two neurons, the darker their neighborhood line; the shorter the distance, the more similar they are to each other and the brighter the line. Neurons 25 and 26 are farther apart, whereas neurons 5 and 6 are closer to each other and more similar (Figure 6). In Figure 7a, the range of the input temperature data is 70, 80 and 90 °C, comprising 75 data points. As seen, the black neurons 1, 2, 8, 9, 21, 26, 27 and 28 have the least excitability, while the yellow neurons 24 and 25 have higher excitability. The same analysis can be done for Figure 7b,c. The target and simulated data can be plotted and matched in the program so that the simulation error is clearly seen; this is shown in Figure 8a,b. In these diagrams, the horizontal axis is the data index and the vertical axis shows the network output values (modeling results) and the target input data (the experimental results for the Nusselt number and energy consumption).
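Under these definitions, the evaluation metrics can be computed directly; the short NumPy sketch below uses made-up example values (not the paper's data) to show the calculation of MSE, R and e_MAX:

```python
import numpy as np

u_exp = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical experimental values
u_ann = np.array([1.1, 1.9, 3.0, 4.2])   # hypothetical network predictions

mse = np.mean((u_exp - u_ann) ** 2)      # mean square error, Equation (4)
r = np.corrcoef(u_exp, u_ann)[0, 1]      # correlation coefficient R
e_max = np.max(np.abs(u_exp - u_ann))    # maximum absolute error e_MAX
```

Values of R close to 1 and MSE close to 0 indicate a successful fit, as discussed in the results.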
As seen, the network prediction is very close to the experimental data, indicating the high accuracy and success of the program with the topology of 3-22-2. The discrepancy between the model data and the actual data may be due to measurement error or other errors during testing. The obtained values of e_MAX for the Nusselt number and energy consumption are 0.1114 and 0.02, respectively.

One of the major problems in artificial neural network training is over-fitting (over-training), in which the generated artificial neural network can only produce a good forecast for a known data set and is unable to provide a reasonable forecast for a new data set. To avoid over-fitting and improve the generalization capacity of the network, the early-stopping technique is used. In this method, all data are randomly divided into three subsets: training, validation and test. The training set is used to train the network, and the validation set is used to ensure the accuracy and generalization of the network developed during the training process. The network training process stops when the validation-set error begins to increase, even if continued training would further reduce the training-set error. When network training is stopped, the test set is used to check the final performance of the network.

Using the same procedure, the mean square errors are 0.002335, 0.00338548, 0.0014088 and 0.0024226 for the Nusselt number, and 0.00011405, 0.00021978, 0.00015764 and 0.0001363 for energy consumption. The closer the correlation coefficients are to 1 and the mean square errors to 0, the better the prediction of the Nusselt number and energy consumption. Figure 12a,b show the effect of the number of training cycles on neural network performance. As seen in these figures, the mean square error of the network starts from a large value and gradually decreases [34], which means that the network learning process is progressive. The validation set is used to monitor the network.
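The early-stopping procedure described above can be sketched as follows; this toy NumPy example (a linear model on synthetic data, with an assumed patience of 10 cycles) is illustrative only and is not the authors' MATLAB implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic data: 150 samples, 3 inputs, noisy linear target
X = rng.normal(size=(150, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=150)

# Random split into training, validation and test subsets
idx = rng.permutation(150)
tr, va, te = idx[:105], idx[105:128], idx[128:]

w = np.zeros(3)                # model weights
lr = 0.01                      # learning rate
best_err, best_w, wait = np.inf, w.copy(), 0
for epoch in range(500):
    g = 2 * X[tr].T @ (X[tr] @ w - y[tr]) / len(tr)  # training-set gradient
    w -= lr * g
    val_err = np.mean((X[va] @ w - y[va]) ** 2)      # validation error
    if val_err < best_err:                           # keep the best weights seen
        best_err, best_w, wait = val_err, w.copy(), 0
    else:
        wait += 1
        if wait >= 10:         # stop once validation error keeps rising
            break

test_err = np.mean((X[te] @ best_w - y[te]) ** 2)    # final check on the test set
```

Training proceeds while validation error keeps improving; the held-out test set is only consulted once, after training stops.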
The network training process continues as long as the error on the validation set decreases. According to Figure 12a,b, as the number of cycles increases to about 2 and 3, the training and test errors decrease; after that, as the number of cycles increases further, the error rate does not change much and remains almost constant. Therefore, in this study, 2 and 3 cycles are considered the desired numbers of cycles for the Nusselt number and energy consumption, respectively. According to Figure 12a,b, the best network performance, in terms of mean square error, is 0.0038548 for the Nusselt number and 0.00021976 for energy consumption. The validation data indicate that the neural network predicts and generalizes appropriately, in such a way that the output data of the algorithm are consistent with the actual output data. When validation on new data causes the network error to increase, the training process is stopped.

Results and Discussion
Just as too few neurons in the middle layer can produce undesirable results, an increase in the number of neurons in the middle layer beyond a certain point also leads to an increase in error: too many neurons in this layer make the network overly complex and over-trained, while too few neurons prevent the network from being trained properly, so that the data are merely stored rather than learned. As shown, the values of R and MSE are presented only for the overall data. It is seen that a structure with 22 neurons in the hidden layer is the best structure for the model (Tables 2 and 3). The radial basis and perceptron artificial neural networks with the algorithms mentioned in Tables 2 and 3, by learning a number of recorded data, were able to predict the Nusselt number and the energy consumption over the whole training range. The good performance of the neural network in the present study can be attributed to the intelligent data analysis process, appropriate measurements, use of scattered data, generalizability, the ability to learn, parallel processing, robustness and the selection of effective characteristics of the shell-and-tube heat exchanger (during the training process, the neural network creates logical relationships between input and output mappings and uses them to calculate data not used in network training). The results of the Levenberg-Marquardt algorithm for the radial basis and perceptron neural networks are also much more successful than those of the momentum algorithm. Figure 13a,b represent the error distribution for the Nusselt number and energy consumption in terms of the number of samples. As shown, a uniform distribution is observed around zero, with the highest error values for the Nusselt number data in the range of ±0.1 and for the energy consumption data in the range of ±0.02.
Examining the performance diagrams for the Nusselt number and energy consumption, the results are desirable for the following reasons:
1. The mean square error values are low.
2. The mean square errors of the training, validation and testing sets have similar behavior and characteristics.

Conclusions
In this study, the prediction of the Nusselt number and energy consumption in the processing of tomato paste using a perceptron artificial neural network was evaluated. The input parameters included the Reynolds number, temperature and concentration of the aluminum oxide nanofluid, while the Nusselt number and energy consumption were the output or target data. The hidden layer had 22 neurons with a sigmoid transfer function, and the output layer used the Purelin function. According to the results, the neural network with a topology of 3-22-2 had high accuracy: the network evaluation indicators, such as the correlation coefficient and mean square error, showed desirable values, and the predicted values for the Nusselt number and energy consumption were in good agreement with the experimental results.
Statistical analysis of the regression through correlation coefficients and mean squared errors showed that the multilayer perceptron network with the Levenberg-Marquardt algorithm modeled the Nusselt number and energy consumption better than the radial basis function network with the momentum algorithm.
The smaller the discrepancy between the output data from the modeling and the laboratory data, the higher the efficiency of the algorithm used in predicting the output.
Validation data show the ability to appropriately and favorably predict and generalize the neural network.
When validation against new data causes the network error to increase, the training process is stopped.

Conflicts of Interest:
There is no conflict of interest.