1. Introduction
Energy efficiency and sustainability are among the main concerns of modern societies. Energy plays a crucial role in the development of countries and companies, so securing its supply is essential. Geopolitical factors such as conflicts between countries, socio-economic interests, or adverse environmental events highlight the scarcity of this resource and the need for energy optimization plans [
1]. Industry has an important role in this context, since this sector consumes around 37% of the world’s total delivered energy [
2].
In recent years, industries have placed significant emphasis on energy optimization, in particular the refrigeration industry, which consumes 20% of the world’s total energy [
3]. Historically, industrial systems were designed from a functional perspective, without taking into account factors such as energy efficiency or environmental impact. Early automated chiller control systems were based on setpoint-based air/water temperature control, carried out by switching the system’s compressor on and off. Over time, more advanced control systems were designed to increase productivity and reduce energy consumption. Two factors drove this improvement. First, new manufacturing techniques produced more efficient components and machines, such as electronic valves and high-performance compressors. Second, advances in process control techniques improved the efficiency of refrigeration systems. New modeling and optimization methods have made it possible to characterize systems to facilitate their design and control.
The modeling of a refrigeration system is mainly based on solving a mathematical problem. The objective is to find the optimal solution by reducing the error of a cost function. Modeling problems in refrigeration systems focus on the optimization of energetic parameters, the most relevant of which are the coefficient of performance (COP) and the energy efficiency ratio (EER).
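As a point of reference for the two metrics named above, the following minimal sketch computes the COP as the ratio of cooling capacity to electrical power input and converts it to an EER using the usual factor of about 3.412 BTU/h per watt. The function names and the sample values are illustrative assumptions and are not taken from the reviewed studies.

```python
# Approximate conversion factor: 1 W is about 3.412 BTU/h.
BTU_PER_WATT_HOUR = 3.412


def cop(cooling_capacity_w, power_input_w):
    """Coefficient of performance: cooling delivered per unit of power consumed."""
    if power_input_w <= 0:
        raise ValueError("power_input_w must be positive")
    return cooling_capacity_w / power_input_w


def eer(cooling_capacity_w, power_input_w):
    """Energy efficiency ratio in BTU/(W*h), derived from the COP."""
    return cop(cooling_capacity_w, power_input_w) * BTU_PER_WATT_HOUR


if __name__ == "__main__":
    # Hypothetical chiller: 3 kW of cooling for 1 kW of electrical input.
    print(cop(3000.0, 1000.0))
    print(eer(3000.0, 1000.0))
```

In an optimization setting, a quantity such as the negative COP (or the power input at a fixed cooling load) would typically play the role of the cost function to be minimized.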
One of the main challenges in the study of refrigeration systems is the modeling of the system. The elements that make up a cooling system have a non-linear response and are highly coupled. Moreover, these systems are multi-parametric and obey complex thermodynamic laws [
4]. Therefore, the modeling of these systems has been carried out mainly with mathematical methods or statistical approaches. However, analytical models are difficult to integrate into real production environments. The models are highly dependent on the system under study, as well as on the assumptions and preconditions made when solving the problem. This implies that most analytical models cannot be applied to real production systems because the designs are not able to respond to real environmental conditions. In production, control systems use simplified mathematical models and PID (proportional, integral, and derivative) controllers. However, the design of such control systems is complex and requires considerable knowledge and experience.
The challenges involved in the design of refrigeration systems have led to the search for more effective modeling techniques to satisfy the needs of the industry. At the same time, several industries [
5], in particular the refrigeration sector [
6], have taken advantage of new data acquisition techniques to apply artificial intelligence (AI) algorithms for system modeling. Experience-based AI models are able to characterize a system based on the prior knowledge we have about it. Some AI techniques are based on statistical models that can take into account properties such as physical laws or the dynamics of the system under certain conditions; others are based on rules that describe the behavior of a system based on the knowledge of an expert. In short, these AI techniques rely on prior knowledge about a system, whether derived from mathematical models or empirical data. Over the years, different AI techniques have been used to model and optimize refrigeration systems. Among the most relevant methods are Fuzzy Logic [
7], heuristic algorithms such as Genetic Algorithms (GA), and expert systems. Studies such as [
8,
9] use Fuzzy Logic rules to optimize the control of a refrigeration system, increasing the robustness and response under different working conditions. In studies [
10,
11,
12], researchers focus on the reduction in energy consumption in air conditioning systems. By using Fuzzy Logic techniques, they are able to efficiently control the temperature of the refrigeration chamber by acting on different elements of the system such as valves or fans. In [
11], the authors reduce energy consumption by 30% per day by applying these techniques in the control system.
One of the most widespread AI algorithms for optimization problems in refrigeration systems has been Genetic Algorithms [
13]. In the literature, we can find a wide variety of studies about the applications of GA for optimizing the energy consumption in refrigeration systems. In [
14], using a GA model, the authors reduce the energy consumption of a vapor compression refrigeration system by up to 8%. The model focuses on the optimization of the system operating costs by controlling the interaction between components, the production environment conditions, and the cooling load in the system. On the other hand, the work presented in [
15] proposes a GA-based control strategy for an air conditioning system to reduce energy costs. By monitoring variables such as power consumption, thermal comfort, chamber air quality, relative humidity of the air, and ventilation flow level, the model increases the efficiency of the system by reducing the overall energy consumption. Besides GA, other stochastic algorithms have been used in the refrigeration industry. Among them, we can find Simulated Annealing [
16] or Particle Swarm Optimization [
17].
In recent years, knowledge-based techniques have increasingly been replaced by data-driven Machine Learning (ML) models, among which artificial neural networks (ANNs) stand out. Data-driven models are built by correlating the input and output data of a system. These techniques are particularly useful for modeling complex non-linear systems about which little information or experience is available. ANNs have become one of the main modeling techniques for refrigeration systems. In addition to offering a simple and efficient modeling technique, neural networks have demonstrated better performance than traditional modeling techniques in the refrigeration industry [
18,
19,
20].
The objective of this study is to review the application of neural networks in the field of industrial refrigeration. Throughout the paper, it is shown how different types of neural networks can be applied for energy optimization, system modeling, process control, or fault diagnosis in refrigeration systems.
This study is structured as follows:
Section 2 sets out the context and the different factors that have led to the use of neural networks in industry and, in particular, in refrigeration systems;
Section 3 reviews the different applications of neural networks in refrigeration systems, how the models have been applied, and which ones are more suitable for certain tasks;
Section 4 discusses the application of ANNs in refrigeration systems, as well as the challenges and new trends of future refrigeration systems;
Section 5 presents a summary of the work and the final conclusions.
2. Paradigm Shift—Industry 4.0
In recent years, the industrial sector has undergone a profound change of paradigm due to the adoption of information and communication technologies in the industrial processes. Technologies such as IoT, AI, Big Data, or Cloud Computing shape the concept of Industry 4.0 or fourth industrial revolution [
21]. This concept is founded on three principles: the interconnection of devices, massive data acquisition, and the processing and management of information for decision making.
New wireless communication protocols with higher performance and lower power consumption, together with improvements in electronic devices, have enabled widespread sensorization in industry [
22]. The IoT concept consists of Machine-To-Machine (M2M) data networks in which smart devices communicate with each other, transmit information, and make autonomous decisions. In recent years, the use of IoT technologies has intensified in industry [
23] since the access to new real-time process data is improving the productivity, efficiency, and quality of factories.
For years, industries such as refrigeration have been storing large amounts of information from process monitoring systems. Today, the volume of process information has multiplied due to sophisticated monitoring systems and IoT. This large amount of data is an important part of the know-how of companies. However, processing such volumes of information is a labor-intensive task. As the quantity and quality of data increase, new methods of data processing and control are needed. ML algorithms, and in particular ANNs, have the ability to extract features from large amounts of data and use this knowledge to make autonomous decisions without each particular scenario having to be programmed. In this way, ANN models give machines the ability to learn to extract complex data patterns, analyze trends, draw conclusions and act, just as an expert would. One of the main strengths of ANNs is the ability to extract high-value information from seemingly uncorrelated data. These models are capable of detecting patterns and trends in datasets that would be very difficult to find through conventional analysis. Therefore, neural network algorithms are leading a paradigm shift in industry, transforming traditional machines into smart and self-learning devices.
3. Artificial Neural Networks in Industrial Refrigeration Systems
The refrigeration sector is involved in Industry 4.0. Refrigeration systems are becoming more and more sensorized and, consequently, the ability to control them increases as our knowledge of them grows. Data-driven AI algorithms have the capacity to model a system by using exclusively input and output data of the system. This set of techniques, which learn through experience, is known as ML algorithms; among them, artificial neural networks stand out.
3.1. Artificial Neural Networks
ANNs are machine learning models inspired by biological neural networks. These computational models are specifically designed to recognize patterns and extract relationships from data. Neural networks are composed of subsystems called neurons (Figure 1). Each neuron has an activation function that relates the neuron’s inputs to its output.
A neuron can be described with a transfer function according to Equation (1):
$$y = f\left(\mathbf{w}^{\top}\mathbf{x} + b\right) \tag{1}$$
where y is the output of the neuron, f is the activation function of the neuron, w is the vector of weights, x is the vector of inputs, and b represents the bias.
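The transfer function of a single neuron, with activation f, weights w, inputs x, and bias b, can be sketched in a few lines of NumPy. The function name `neuron_output` and the default choice of tanh as activation are assumptions made for illustration.

```python
import numpy as np


def neuron_output(x, w, b, f=np.tanh):
    """Return f(w . x + b) for one neuron.

    x: input vector, w: weight vector of the same length, b: scalar bias,
    f: activation function applied to the weighted sum.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return f(np.dot(w, x) + b)


if __name__ == "__main__":
    # Two hypothetical inputs, e.g. normalized evaporator and condenser temperatures.
    print(neuron_output([0.2, 0.8], [0.5, -0.3], 0.1))
```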
An ANN is made up of several interconnected neurons. These are organized in layers: an input layer, an intermediate (hidden) layer with one or more neurons, and an output layer, so that the intermediate neurons connect the inputs to the outputs of the network. The connections between neurons carry weights that scale the output of one neuron before it is fed to the input of a neuron in the next layer.
Like other ML algorithms, neural networks require a training process. In this phase, the network is able to model a system from an input/output dataset. The training consists of iteratively calculating the network parameters. The most widely used training method is the backpropagation algorithm. It is a gradient descent algorithm that is highly effective and computationally efficient [
24]. It adjusts each weight and bias of the network to reduce the error of a cost function, and it is repeated until convergence is reached. It works as follows. First, it calculates the output of the network through its successive layers (forward pass). Second, it calculates the prediction error using a loss function, typically the mean squared error (MSE), which compares the desired output with the actual output of the network. Third, it calculates how much each connection contributed to that error by propagating the error gradient backwards through the network (backward pass). Finally, it performs a gradient descent step to adjust all connection weights in the network using the error gradients calculated earlier.
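The training loop described above can be illustrated with a small NumPy sketch: a network with one tanh hidden layer and a linear output, trained with full-batch gradient descent on an MSE loss. The layer size, learning rate, number of epochs, and the toy dataset are all assumptions chosen for illustration and do not correspond to any of the reviewed studies.

```python
import numpy as np


def train_mlp(X, y, hidden=8, lr=0.1, epochs=300, seed=0):
    """Train a one-hidden-layer network with backpropagation.

    X has shape (n_samples, n_inputs); y has shape (n_samples, 1).
    Returns the learned parameters and the MSE recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    W1 = rng.normal(0.0, 0.5, size=(n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: compute the network output layer by layer.
        h = np.tanh(X @ W1 + b1)
        y_hat = h @ W2 + b2
        # Loss: mean squared error between prediction and target.
        err = y_hat - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the error gradient towards the inputs.
        d_out = 2.0 * err / n
        grad_W2 = h.T @ d_out
        grad_b2 = d_out.sum(axis=0)
        d_hidden = (d_out @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
        grad_W1 = X.T @ d_hidden
        grad_b1 = d_hidden.sum(axis=0)
        # Gradient descent step on every weight and bias.
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return (W1, b1, W2, b2), losses


if __name__ == "__main__":
    # Toy non-linear target standing in for a measured system response.
    X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
    y = X ** 2
    _, losses = train_mlp(X, y)
    print(losses[0], losses[-1])
```

In practice a framework with automatic differentiation would compute the gradients; writing them out by hand here only serves to make the four steps visible.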
In order to apply the gradient descent algorithm successfully, the activation functions of the neurons must be differentiable. Moreover, these functions must be non-linear for the network to model and extract complex patterns from data. The most commonly used activation functions in artificial neurons are the Rectified Linear Unit (ReLU), Hyperbolic Tangent (Tanh), and Softmax.
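For reference, the three activation functions named above can be written as follows. This is a minimal NumPy sketch; subtracting the maximum inside the softmax is a common numerical-stability trick and is an implementation choice rather than part of the definition.

```python
import numpy as np


def relu(z):
    """Rectified Linear Unit: max(0, z), applied element-wise."""
    return np.maximum(0.0, np.asarray(z, dtype=float))


def tanh(z):
    """Hyperbolic tangent, applied element-wise; outputs lie in (-1, 1)."""
    return np.tanh(np.asarray(z, dtype=float))


def softmax(z):
    """Softmax over a 1D vector: non-negative outputs that sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift by the maximum to avoid overflow
    return e / e.sum()


if __name__ == "__main__":
    z = [-1.0, 0.0, 2.0]
    print(relu(z), tanh(z), softmax(z))
```

ReLU is not differentiable exactly at zero; implementations usually assign a fixed derivative value at that point.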
In addition to variables such as weights, bias, or number of neurons, ANNs have other configuration hyper-parameters that influence the performance of the model. The main hyper-parameters are summarized in
Table 1.
In practice, there is no deterministic method for estimating the size of a neural network. Small ANNs have a smaller number of parameters; therefore, they are more likely to suffer under-fitting as they do not have the ability to learn complex structures of the data. However, while larger networks can extract more complex features from data, they can lead to over-fitting. Likewise, there is no exact formula for the calculation of the hyper-parameters of the network. It is necessary to find a compromise between performance, time consumption, and error. This task is usually performed experimentally by trial and error [
27].
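The trial-and-error search mentioned above is often organized as an exhaustive grid search over candidate hyper-parameter values. The sketch below assumes a user-supplied `evaluate` function that trains a model with a given configuration and returns its validation error; that function, the parameter names, and the candidate values are hypothetical.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid` and return the one with the lowest error.

    evaluate: callable taking a dict of hyper-parameters and returning an error.
    grid: dict mapping each hyper-parameter name to a list of candidate values.
    """
    names = list(grid)
    best_cfg, best_err = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        err = evaluate(cfg)
        if err < best_err:
            best_cfg, best_err = cfg, err
    return best_cfg, best_err


if __name__ == "__main__":
    # Stand-in error function; a real one would train and validate a network.
    def fake_validation_error(cfg):
        return abs(cfg["hidden_neurons"] - 8) + abs(cfg["learning_rate"] - 0.1)

    grid = {"hidden_neurons": [4, 8, 16], "learning_rate": [0.01, 0.1]}
    print(grid_search(fake_validation_error, grid))
```

The cost of this approach grows multiplicatively with the number of hyper-parameters, which is one reason the compromise between performance and time consumption matters.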
Due to the great ability of neural networks for modeling non-linear and multi-parametric systems, their use in refrigeration systems has increased significantly in recent years [
28] (see
Figure 2).
The advantages of ANNs include less time and effort in modeling thermodynamic systems and the ability to learn from examples. For this reason, ANNs have been used in applications such as refrigeration system modeling for energy saving, extraction of refrigerant properties, or fault diagnosis in refrigeration systems.
3.1.1. Refrigeration System Modeling Using ANNs
The objective of modeling a refrigeration system is to know how the system behaves under certain operating conditions. Models are useful for determining the optimum operating point of a cooling system. ANN modeling consists of training a neural network using a dataset with information from the system. The dataset is made up of input variables that are related to one or more output variables (Table 2). By using an adequate number of training samples, the neural network will be able to learn the dynamics of the system and predict how the system will behave even under conditions never seen before.
The main application of ANN modeling in refrigeration systems has been the optimization of energy consumption, where 70% of the articles analyzed were related to the energy performance of refrigeration systems [
40,
45].
In [
29], the authors model a refrigeration system with an evaporative condenser using ANNs. Their main objective is to predict the power consumption of the system by using information such as the evaporator load, air and mass flow rate, and air-dry and wet bulb temperatures in the condenser. With this information, they predict the power consumption of the compressor. With a neural network with only four neurons and a dataset with 60 samples, they obtain a correlation coefficient higher than 0.9 for all model predictions.
Most of the studies related to the energy optimization of refrigeration systems use COP as a target variable as can be seen in [
31,
37]. In the same way, the analysis of the energy performance of an experimental refrigeration system is addressed in [
34] by means of COP analysis. In this study, the authors generate a large amount of data by modifying variables such as compressor rotation speed and the temperature and volumetric flow of the secondary fluids of the net. After training the ANN, the authors are capable of knowing under which operating conditions their system is more efficient.
More complex systems—such as cascade configurations, variable speed refrigeration systems, or ejector compressor systems—have been modeled using ANNs. By analyzing the evaporator load and the water flow rate, the model proposed in [
46] is able to estimate the compressor power consumption and COP of a cascade refrigeration system. The authors analyze how the COP of the system is affected by certain events such as an increase in load or evaporator temperature. On the other hand, the work presented in [
32] uses a neural network to investigate how the operating frequency of a variable speed refrigeration system affects the COP. The presented model obtains low prediction error rates using a dataset with a limited number of samples. Other configurations, such as ejector refrigeration systems, have also been analyzed with ANNs. These systems have certain advantages over classical compressor refrigeration systems: they are simpler, cost less, and require little maintenance. However, they have a lower coefficient of performance, so their optimization is essential for their correct design and operation. In [
47], the authors design an ANN for the modeling of an ejector refrigeration system. Based on different temperature points and the generator pressure, the model estimates the COP of the system with high accuracy.
On the other hand, studies such as [
33,
48] have modeled different cooling systems using a hybrid strategy called ANFIS (Adaptive Neuro-Fuzzy Inference System). This technique consists of the combination of two AI techniques, ANNs and Fuzzy Logic. In [
48], the performance of a simple ANN is compared with that of an ANFIS system and, although both techniques achieve excellent results, ANFIS outperforms the ANN. The authors explain that this improvement is due to the fact that an ANFIS network combines the learning capabilities of an ANN with the reasoning capabilities of Fuzzy Logic.
Most authors agree that the proposed ANN models, in addition to serving as a design tool, can be integrated into controllers to reduce the energy consumption of systems. In [
44], the authors use an ANN to control the response of a simple compression refrigeration system. By monitoring the temperature and pressure at the outlet of the evaporator, a model has been designed to control the opening of the expansion valve of the system. The model was integrated into the control system which acts on the expansion valve to maintain the system at its optimum operating point. The results of the study show that ANN-based controllers perform as well as PI-based or predictive functional controllers.
Other control systems based on ANNs have focused on acting on the compressor operating cycle to optimize the power consumption of the compressor in refrigeration systems. In [
41], by means of an ANN model, the authors predict the duty cycle of the compressor. First, the model is used to compute the optimum hysteresis temperature and power consumption under different operating conditions. Then, the results are integrated into the control system to determine the optimal compressor duty cycle. The proposed control system achieves energy savings of up to 13.4%, depending on factors such as ambient temperature or the stored material in the chamber.
3.1.2. Refrigerant Properties Modeling Using ANNs
One of the most important applications of ANNs in refrigeration systems is the analysis of refrigerant properties. It is well known that the type of refrigerant used in a refrigerant circuit has a direct impact on the performance of the refrigeration system. The thermodynamic analysis of a refrigerant is complex because it involves solving differential equations. In order to simplify this task, ANNs have been used for the characterization and evaluation of the thermodynamic properties of refrigerants in refrigeration systems.
In studies such as [
49], the thermodynamic properties of an alternative refrigerant, R407C, have been analyzed by using ANNs. The article analyzes two regions of the refrigerant: saturated liquid-vapor and superheated vapor. Using the temperatures and pressures of the different states of the refrigerant, the authors design a model capable of predicting the enthalpy and entropy of the refrigerant. The model performs accurate predictions, with R² values close to 1.
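The R² figure used to report prediction accuracy is the coefficient of determination, which compares the squared prediction errors with the variance of the measured values. A minimal NumPy sketch follows; the function name and sample values are illustrative assumptions.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical measured and predicted enthalpy values.
    measured = [250.0, 260.0, 270.0, 280.0]
    predicted = [251.0, 259.0, 271.0, 279.0]
    print(r_squared(measured, predicted))
```

A value of 1 means the predictions match the measurements exactly, while a value of 0 means the model does no better than always predicting the mean of the measurements.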
In [
36], the energy performance of various refrigerants (R134a, R450A, and R513A) in a vapor compression refrigeration system is analyzed. To do this, the authors employ an ANN with one hidden layer. With only two input variables, the evaporator and condenser temperatures, they are able to predict the cooling capacity, power consumption, and COP of the system. After training and evaluating the model, they conclude that refrigerants R134a and R513A presented a similar energy performance, while R450A had about 10% less power consumption than the other two refrigerants. In contrast, R450A presented a slightly lower cooling capacity than R134a. Similarly, in [
30], the authors model the refrigerant R1234yf by means of an ANN. The work is focused on the analysis of the energy performance of the refrigerant. For this purpose, the COP is calculated using variables such as compressor rotation speed and temperatures of the secondary fluids at the evaporator and condenser inlet. With the model, the authors build energetic maps to identify the optimal operating zones of the refrigeration system.
3.1.3. Fault Diagnosis in Refrigeration Systems Using ANNs
Failures in cooling systems increase energy consumption and decrease efficiency of chillers. In addition, continuous episodes of malfunctions over time can lead to system breakdowns, resulting in elevated operating costs. Due to the importance of early fault detection and diagnosis in industrial systems, fault analysis has been a broad field of research.
Several studies have used ANNs to predict failures in refrigeration systems. The authors in [
42] design a neural network model that predicts the energy consumption of a compressor with errors of less than 5%. They also predict other variables such as cooling capacity or chilled water outlet temperature with high accuracy. The authors propose a failure detection model based on the comparison of the predicted variables with their actual value. In this way, they are able to detect deviations in the normal operation of the system and trigger fault alarms.
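The comparison-based detection scheme described above can be sketched as a simple residual check: a sample is flagged when the measured value deviates from the model prediction by more than a chosen relative tolerance. The 5% threshold and the function name below are assumptions for illustration and are not the criteria used by the cited authors.

```python
import numpy as np


def detect_faults(measured, predicted, rel_threshold=0.05):
    """Flag samples whose measured value deviates from the prediction.

    Returns a boolean array that is True where the relative deviation
    |measured - predicted| / |predicted| exceeds rel_threshold.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Guard against division by zero for predictions close to zero.
    denom = np.maximum(np.abs(predicted), 1e-12)
    return np.abs(measured - predicted) / denom > rel_threshold


if __name__ == "__main__":
    # Hypothetical compressor power readings (kW) against model predictions.
    print(detect_faults([100.0, 120.0, 100.0], [100.0, 100.0, 99.0]))
```

A practical system would usually require the deviation to persist over several consecutive samples before raising an alarm, to avoid reacting to sensor noise.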
On the other hand, studies such as [
50] go further by proposing a model capable of predicting up to eight types of system failures. The authors present an ANN model for diagnosing the faults of a vapor compression refrigeration system. First, they train a neural network with a dataset that relates different types of system failures to various pressure and temperature points in the system. Then, the ANN acts as a classifier capable of predicting a certain type of system failure from the measured variables.
Fault detection tasks are complex as they depend on a large number of parameters. In addition, common measurement variables and system failures are poorly correlated. Often, it is necessary to use advanced feature extraction techniques to be able to accurately model fault prediction systems. For this reason, most researchers prefer to use more advanced ANN configurations such as deep learning algorithms.
3.2. Deep Neural Networks
Deep neural network (DNN) models are defined as artificial neural networks composed of several hidden layers (see
Figure 3). DNNs have the ability to solve more advanced problems due to their deep feature extraction capabilities. These networks are widely used in image processing and speech recognition tasks [
26].
Like other neural network algorithms, DNNs learn from examples; therefore, they need a previous training process. However, deep models are harder to train due to the complexity of their network structure. A higher number of hidden layers implies a higher number of parameters and a higher dimensionality of the network. DNNs need a larger amount of training data, otherwise the model may suffer from overfitting. This phenomenon occurs when the network extracts features from the training dataset, but is not able to generalize the problem properly. When the volume of data is limited, traditional methods tend to perform better than deep learning ones. However, with sufficient data, DNNs outperform traditional ML methods in different domains such as computer vision, speech recognition, natural language processing, and time-series forecasting [
52].
With the increasing volume of data in the industry and the design of more advanced cooling systems, researchers are looking for more sophisticated modeling and control techniques. Several studies have been carried out demonstrating the advantages of DNNs for system modeling, fault detection, and energy efficiency in the refrigeration industry.
3.2.1. Refrigeration System Modeling Using DNNs
As with ANNs, the use of DNNs in cooling systems has been focused on energy efficiency. Several studies have shown that this kind of network can outperform classical ML models and ANNs. One example is study [
53], where the authors design and compare two AI models to evaluate the performance of a water chiller and create maintenance schedules for it. The compared models are Linear Regression, a classical ML model, and a DNN. Based on four water temperature points and two water velocity measurements in the condenser and evaporator, the authors predict the compressor power consumption. The best results are obtained with the DNN, with R² equal to 0.97. The linear regression model, however, performs poorly because the algorithm is not able to capture the non-linearity of the system.
On the other hand, in [
54], a model has been designed to predict the energy consumption of a variable refrigerant flow system. First, the authors conduct a study to determine which input variables have the greatest influence on the energy consumption of the system. After the analysis, the selected variables are outdoor temperature, indoor temperature, and cooling load. Various network topologies are analyzed in the study, from a simple ANN to DNNs. The best results were obtained using a DNN with two hidden layers of 15 neurons each. The proposed model accurately predicts the energy consumption of the system under study. Similarly, in [
55], the authors compare the performance of an ANN with various DNN configurations to model a refrigeration system. Using different chilled and condenser water temperatures, they predict the power consumption and COP of the system. In the study, the authors compare different parameterizations and network configurations for training the model. First, they compare two of the most commonly used activation functions in neural networks, Sigmoid and ReLU. Second, they vary the number of hidden layers of the model from one hidden layer (an ANN) to five-layer DNN configurations. The best performance is obtained with a four-layer deep network and the ReLU activation function. Finally, the model is used to control the temperature in a real refrigeration system. The model achieves energy savings of up to 24.7% in the chiller and 7.4% in the overall system compared to a conventional control system. In the same way, other DNN-based chiller control systems have been successfully implemented, achieving energy savings of up to 17% [
56].
Among the studies using DNNs to model refrigeration systems, a significant increase in the number of input variables has been detected. While the average number of inputs in the ANN studies analyzed was around three, studies such as [
57] use up to 11 input variables for training the DNN models.
3.2.2. Fault Diagnosis in Refrigeration System Using DNNs
Fault diagnosis is one of the main applications of DNNs in refrigeration systems. Fault detection tasks require the analysis of a large number of system variables over long periods. The ability to process large datasets and multidimensional data makes DNNs suitable for modeling failures in industrial refrigeration systems.
A representative example can be found in [
58], where a DNN is trained using a dataset of 12,000 samples and more than 64 input variables from a refrigeration system. All data are labeled with seven common types of system failures. The neural network acts as a system fault classifier. The model receives operational data in real time and is able to predict whether the system suffers any of the following failures: reduced evaporator water flow, condenser fouling, reduced condenser water flow, non-condensables in the refrigerant, refrigerant leak/undercharge, refrigerant overcharge, or excess oil. After evaluation, the model achieves 98% accuracy in predicting and classifying failures.
On the other hand, in [
59], the authors carry out a comparative study of different data-driven fault diagnosis methods for a variable refrigerant flow system. The following models are compared: Decision Trees, Support Vector Machines (SVM), Clustering, ANN, and DNN. For training, two datasets of 31,800 and 15,900 samples are used. Both contain information to predict six types of system faults, such as compressor liquid floodback or indoor/outdoor unit fouling. After comparing the models, the best performance was achieved by the DNN model, followed by the ANN and SVM. However, all models accurately predict all categories of failures. The authors attribute this to the fact that all models have been trained with a large quantity of data and that all fault categories are well balanced in the dataset.
From the review of DNN articles, we can draw the following conclusions. First, the amount of training data has increased significantly in fault diagnosis applications. While the papers analyzed for energy performance modeling use datasets with hundreds or several thousand samples, fault diagnosis studies use several tens of thousands of samples for training the models. Second, the number of neurons used in the hidden layers of the networks also differs. While modeling studies such as [
53,
57] use 10 neurons per layer, in studies such as [
58,
60], the authors use 64 and 200 neurons respectively for fault detection. In [
60], the authors reaffirm that the use of deeper networks can improve model results. After analyzing different topologies, a network with five hidden layers was chosen. It predicts five chiller failure types with an accuracy of 95%.
3.3. Convolutional Neural Networks
A convolutional neural network (CNN) is a particular type of DNN inspired by the visual cortex of the human brain for object recognition [
24]. CNNs are specifically designed to process spatial data and are therefore widely used in image recognition and computer vision tasks.
CNNs are based on multilayer neural models; however, CNNs use convolutional filters instead of artificial neurons. These filters process the input data through the convolution operation. Convolutional filters extract spatial features from the data to improve the classification and prediction properties of the network. As with other neural network models, CNNs require a training process to adjust the parameters of the filters.
CNNs are composed of several layers; the first layers extract low-level features, while the later layers combine them into higher-level, more abstract features [
52]. The result of these networks is the creation of feature maps from multidimensional input data [
24]. In
Figure 4, the basic structure of a CNN is shown.
A convolutional network can be summarized in three layers:
- Convolutional layer: It is composed of convolutional filters. Its main function is the extraction of spatial features from the input signals. Convolution operations produce new signals that may reveal more information about the input than the original signal itself.
- Pooling layer: Also called the sub-sampling layer. The main function of this layer is to reduce the dimensionality of the signals of the previous layers. The pooling layer has a beneficial effect on network performance by reducing overfitting and training time. It also reduces the noise of the input data.
- Fully connected layer: This layer is often used at the end of a CNN. It contains artificial neurons that calculate the output of the network based on the output of the previous layers.
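The three layer types listed above can be illustrated with a minimal sketch in plain Python for a 1D signal. The function names (`conv1d`, `max_pool1d`, `dense`), the hand-picked filter, and the weights are illustrative assumptions and are not taken from any of the reviewed studies; in a real CNN these parameters are learned during training.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation, the operation CNN libraries call 'convolution'."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Simple non-linearity applied after the convolution."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """One fully connected neuron: weighted sum of the pooled features."""
    return sum(v * w for v, w in zip(values, weights)) + bias


# Hypothetical sensor trace containing a step change.
signal = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
edge_filter = [-1.0, 1.0]  # assumed filter; responds to rising edges

features = relu(conv1d(signal, edge_filter))      # convolutional layer
pooled = max_pool1d(features, 2)                  # pooling layer
output = dense(pooled, [0.5] * len(pooled), 0.0)  # fully connected layer
```

In this toy example the convolution highlights the rising edge of the signal, the pooling step halves the length of the feature vector, and the final neuron reduces the pooled features to a single output value.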
CNNs are designed to work with 2D and 3D multidimensional data. They have been widely used in image recognition, artificial vision, and text processing. However, their ability to handle large amounts of weakly correlated data and to extract features efficiently from multidimensional data also makes CNNs suitable for the modeling of industrial systems.
3.3.1. Refrigeration System Modeling Using CNNs
Most of the data collected in industrial systems are time series. Typically, different variables are monitored simultaneously in the systems. The data obtained can be interpreted as ‘images’ of characteristics composed of sensor data and time. These images contain information relating the different operating states of the system in the past and future. CNNs can therefore be used to extract features and model the performance of industrial systems. CNNs have been widely used to capture spatial correlation in unstructured data of refrigeration systems. An example of this technique can be found in [
61], where the authors use CNNs to predict the optimal refrigerant charge of a refrigeration system. The refrigerant charge is closely related to the COP of the system. The objective of the proposed model is to predict refrigerant losses or excess of refrigerant charge in order to optimize the energy performance of the system. For this task, a dataset of 34,000 samples with data from 28 sensors is used. The time series of the dataset is transformed into feature images that are used to train and test the convolutional network. The CNN is compared with an ANN. The CNN achieved the best performance with 99.9% accuracy compared to 84.4% achieved by the ANN.
Other studies have used CNNs to model cooling systems from an energy point of view. In [
51], the authors design two CNNs for predicting the energy performance of a heating and cooling system. The first model predicts the energy consumption of the system using seven temperature points and the water flow of the electric heat pump; the second models the COP using the same parameters plus the load of the electric heat pump. Both models obtain a coefficient of determination greater than 0.95. In the training phase, the authors perform several experiments with different data sampling rates. They generate four datasets sampled at different time intervals (3 min, 15 min, 30 min, and 1 h). They conclude that data sampling influences network performance: for their case study, the lower the temporal resolution of the dataset, the worse the results of the neural networks. In [
62], after modeling the energy performance of a refrigeration system, the authors conclude that CNNs achieve better performance than other deep learning models.
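The sampling experiment with four time intervals and the coefficient of determination mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `downsample_mean` and `r_squared` are hypothetical, block averaging is one possible way of lowering the temporal resolution (the reviewed study may have resampled differently), and the data are synthetic.

```python
def downsample_mean(series, factor):
    """Average each block of `factor` consecutive samples
    (a trailing partial block is dropped)."""
    return [sum(series[i:i + factor]) / factor
            for i in range(0, len(series) - factor + 1, factor)]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Synthetic 3-min series with a cycle that repeats every hour (20 samples).
three_min = [float(t % 20) for t in range(60)]

# 3-min samples -> 15-min, 30-min and 1-h series (factors 5, 10 and 20).
by_interval = {minutes: downsample_mean(three_min, minutes // 3)
               for minutes in (15, 30, 60)}

# R^2 of a hypothetical set of predictions against measured values.
score = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

In this synthetic case the hourly series is completely flat, because the averaging window spans a whole cycle. This gives an intuition for why a lower temporal resolution can remove information that the network would need.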
On the other hand, in [
63], the authors use CNNs to model the different operating states of a screw compressor in a chiller. The control of these compressors has some added difficulties because it depends on the status of several valves and actuators of the system. For the modeling, the authors use the compressor temperature and pressure variables, as well as the current consumption of the system. First, they group the data in slots of 30 time-steps. Then, each group is labeled with the corresponding compressor operating mode (0: Off-Ready; 1: Holding; 2: Loading; 3: Unloading). The network is capable of predicting the compressor status in real time, verifying its correct operation. Finally, the model is used in a control system where the actual operating state of the system is compared to the predicted operating state. In this way, anomalies in the system, such as electrical faults or faults in the loading/unloading solenoids of the compressor, can be detected.
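The grouping and labeling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper name `make_slots`, the use of non-overlapping slots, and the majority-vote labeling rule are assumptions. The source states only that the data are grouped in slots of 30 time-steps and labeled with the compressor operating mode.

```python
from collections import Counter

MODES = {0: "Off-Ready", 1: "Holding", 2: "Loading", 3: "Unloading"}


def make_slots(samples, modes, slot_len=30):
    """Group consecutive samples into non-overlapping slots.

    samples: list of per-time-step feature vectors (e.g. temperature,
             pressure, current).
    modes:   operating-mode code recorded at each time step.
    Returns (slots, labels); each label is the most frequent mode in its
    slot (an assumed labeling rule). A trailing partial slot is dropped.
    """
    slots, labels = [], []
    for start in range(0, len(samples) - slot_len + 1, slot_len):
        end = start + slot_len
        slots.append(samples[start:end])
        labels.append(Counter(modes[start:end]).most_common(1)[0][0])
    return slots, labels


# Synthetic example: 70 time steps with 3 variables per step.
samples = [[20.0 + t, 5.0, 1.0] for t in range(70)]
modes = [0] * 30 + [2] * 25 + [1] * 15
slots, labels = make_slots(samples, modes)
label_names = [MODES[code] for code in labels]
```

Each slot is a 30 x 3 matrix that could be fed to a CNN as a small feature image, with the label as the classification target.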
3.3.2. Fault Diagnosis in Refrigeration System Using CNNs
Although CNNs are widely used for modeling, the main application of convolutional networks in refrigeration systems is fault detection and diagnosis [
64]. CNN models, in addition to identifying the presence of faults, are able to predict what type of fault is occurring in the system (see
Figure 5).
In the literature, we can find a wide variety of articles in which convolutional networks are used to detect malfunctioning episodes in refrigeration systems, such as [
66]. In this paper, the authors design a 1D-CNN for early-stage fault detection in a heat pump system. For this purpose, they use a dataset with 14 input variables, including temperatures, pressures, electrical parameters, and system control variables. The article focuses on the detection of gradual failures that correspond to small deviations from the optimal operating point of the system. The dataset contains information on eight types of failures, such as evaporator/condenser fouling or refrigerant leakage. The final model has an accuracy greater than 90% and outperforms a conventional ANN trained for the same application.
In [
67], a hybrid model consisting of a CNN and a Gated Recurrent Unit (GRU) network is used for fault diagnosis in a chiller. First, the CNN extracts the local abstract features of the input data; then, the GRU network captures the global features and the dynamics of the time series. The authors use the Grid Search method to optimize the network parameters. They point out that the number of epochs and the batch size are two important parameters for fault diagnosis models because both determine the feature space of the different working conditions. The hybrid model was compared with a simple GRU network, a Long Short-Term Memory (LSTM) network, and a CNN; the proposed model reached the best performance, with accuracies above 90% in all categories and recall close to 1.
CNNs have been used in conjunction with LSTM networks to enhance the capabilities of both networks separately. Hybrid CNN–LSTM models have been widely used in the industrial sector for time series forecasting such as [
68,
69]. In the refrigeration industry, studies such as [
70] have used this kind of model for fault diagnosis. For this task, the authors use the public ASHRAE database. First, they carry out data processing using an encoder-decoder with an LSTM network. Afterwards, they design a fault classifier using a CNN. In the study, the authors analyze how the number of variables used for training affects the performance of the network. After varying the number of input variables from 6 to 16, they conclude that increasing the number of variables improves the predictions of the network. Using 16 input variables, the model is able to detect and classify several system failures with an accuracy of 99%. As in the previous article, studies such as [
71,
72,
73] have used the ASHRAE RP-1043 database to design fault detection systems using CNNs. ASHRAE RP-1043 is a research project focused on the development and evaluation of fault detection and diagnosis methods applied to chillers. This project provides researchers with a large amount of data from refrigeration systems with different failure episodes.
Alternative variables have been used for fault detection in refrigeration systems. In [
74], the authors use audio signals to predict failures in an industrial refrigeration system. In the study, the sound signals recorded in the refrigeration system are converted into images by using the Fast Fourier Transform. Next, the authors use the data to train a CNN model with 15 hidden layers to classify images with or without failures. On the other hand, in [
75], the researchers use vibration signals from a compressor to diagnose five types of faults in a refrigeration system. Vibration signals are obtained with an accelerometer installed on the surface of the compressor. Then, the data are transformed into time–frequency images. These images are used to train a CNN to classify input data by type of failure. The model reaches an accuracy above 95% in classifying system faults. The authors conclude that the proposed time–frequency image fusion technique gives better results than using the vibration signal directly to train the model.
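The conversion of a sound or vibration signal into a time–frequency image, as described above, can be sketched in plain Python. This is an illustration under stated assumptions and not the pipeline of the reviewed studies: the helper names `dft_magnitudes` and `spectrogram` are hypothetical, the frames do not overlap, no window function is applied, and a naive discrete Fourier transform is used for clarity (an FFT computes the same values more efficiently).

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the discrete Fourier transform for the non-negative
    frequency bins of a real-valued frame (naive O(n^2) version)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def spectrogram(signal, frame_len):
    """Split the signal into non-overlapping frames and stack their
    spectra: rows are time frames, columns are frequency bins."""
    return [dft_magnitudes(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


# Synthetic 'recording': a pure tone that falls in frequency bin 2 of each frame.
frame_len = 16
signal = [math.sin(2 * math.pi * 2 * t / frame_len) for t in range(64)]
image = spectrogram(signal, frame_len)

# Index of the strongest frequency bin in the first time frame.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

The resulting 4 x 9 matrix is the kind of two-dimensional array that can be treated as an image and passed to a CNN classifier.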
3.4. Recurrent Neural Networks
A recurrent neural network (RNN) is a type of DNN designed to process temporal information. RNNs have the ability to model sequential data, being able to remember past information to predict future behavior [
24]. RNNs are widely used in tasks such as text processing and audio or speech recognition.
While traditional neural networks assume that inputs and outputs are independent of each other, the output of an RNN depends on the previous data. The main difference between RNNs and other DNNs is that recurrent networks have backward connections. This means that a recurrent neuron receives the current input plus its own output at the previous instant. Consequently, the output of a recurrent neuron is a function of its previous inputs. For this reason, recurrent neurons are also called memory cells. This particular flow of information allows recurrent networks to remember past events.
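The behavior of a recurrent neuron described above can be written as h_t = tanh(w_x x_t + w_h h_(t-1) + b). The following minimal sketch implements this recurrence for a single neuron; the function name `rnn_forward`, the tanh activation, the zero initial state, and the weight values are illustrative assumptions.

```python
import math


def rnn_forward(inputs, w_x, w_h, bias):
    """Run a single recurrent neuron over a sequence.

    At each step the neuron combines the current input with its own
    output from the previous step:
        h_t = tanh(w_x * x_t + w_h * h_{t-1} + bias)
    """
    h = 0.0  # initial state (assumed to be zero)
    outputs = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + bias)
        outputs.append(h)
    return outputs


# An impulse at the first step followed by zeros: later outputs are
# non-zero only because the neuron feeds back its previous output.
outputs = rnn_forward([1.0, 0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, bias=0.0)
```

With these assumed weights, the trace of the initial impulse shrinks at every step, which gives an intuition for why simple recurrent neurons tend to lose information about inputs that lie many time-steps in the past.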
The training of RNNs has some differences compared to conventional ANNs. Due to the recurrence of the model, for the training, it is necessary to unroll the network over time and then apply the backpropagation algorithm. This technique is known as backpropagation through time (BPTT) [
76]. RNNs have some difficulties in the training phase. First, these models can be unstable during training because the networks can suffer from the vanishing or exploding gradient problem [
77]. The gradient descent algorithm is more likely to diverge when applied to long sequences, causing the model to be unable to extract relationships between temporal datapoints. Second, these networks are not able to remember entries from more than about 10 time-steps earlier; thus, the long-term memory of RNNs is truncated.
In order to address the training and memory problems of RNNs, a variation of the model called LSTM [
78] was introduced. These networks are more robust and largely avoid the training problems that classical RNNs suffer. In addition, LSTM networks have the ability to detect long-term dependencies in the data. This is achieved through a modification in the architecture of traditional RNNs. LSTM cells have internal gates, called the input, forget, and output gates, that control a long-term state acting as a memory. An LSTM cell can learn to recognize an important input (that is the role of the input gate), store it in the long-term state, preserve it for as long as it is needed (that is the role of the forget gate), and extract it whenever it is needed (that is the role of the output gate).
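The role of the three gates can be made concrete with a minimal sketch of one time step of a scalar LSTM cell, following the standard LSTM formulation. The function name `lstm_step`, the parameter dictionary, and the parameter values are hypothetical and chosen only to show how a saturated forget gate preserves the stored state.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.

    p maps each gate name to a (w_x, w_h, bias) triple.
    Returns the new output h and the new long-term state c.
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))         # how much new information to store
    f = sigmoid(pre("forget"))        # how much of the old state to keep
    o = sigmoid(pre("output"))        # how much of the state to expose
    g = math.tanh(pre("candidate"))   # candidate value to store
    c = f * c_prev + i * g            # update of the long-term state
    h = o * math.tanh(c)              # output of the cell
    return h, c


# Assumed parameters: forget gate saturated near 1 and input gate near 0,
# so the stored state is preserved almost unchanged whatever the input.
params = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 20.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 0.8
for x in [0.3, -0.5, 0.9]:
    h, c = lstm_step(x, h, c, params)
```

After three steps with varying inputs, the long-term state is still very close to its initial value of 0.8, which is the mechanism that lets an LSTM keep information over long sequences.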
LSTM networks interpret the inputs as ordered sequences of data. For the training, a sliding window is moved along the time series to group the samples into ordered time intervals, as shown in
Figure 6. In the new dataset, each time interval is paired with the future samples that the model must predict. The size of the target vectors depends on the number of steps ahead we want to predict with the model.
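The sliding-window construction of the training set can be sketched as follows. The function name `sliding_windows` and its parameters `n_inputs` (window length) and `n_ahead` (number of steps to predict) are hypothetical names introduced for this illustration, and the series is synthetic.

```python
def sliding_windows(series, n_inputs, n_ahead):
    """Pair each window of n_inputs past samples with the n_ahead
    samples that immediately follow it."""
    pairs = []
    for start in range(len(series) - n_inputs - n_ahead + 1):
        window = series[start:start + n_inputs]
        target = series[start + n_inputs:start + n_inputs + n_ahead]
        pairs.append((window, target))
    return pairs


series = [10, 11, 12, 13, 14, 15, 16]
pairs = sliding_windows(series, n_inputs=3, n_ahead=2)
# First pair: window [10, 11, 12] with target [13, 14]
```

Increasing `n_ahead` lengthens every target vector and reduces the number of pairs that can be extracted from a series of a given length.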
As introduced in the previous section, most industrial process data are time series. Knowing future events in industrial systems is key to scheduling control and maintenance strategies. For this reason, LSTM networks are increasingly being used for modeling industrial systems. The following sections show how LSTM networks have been applied for time series forecasting in refrigeration systems.
3.4.1. Refrigeration System Modeling Using LSTM Networks
Although studies such as [
80] have applied classical RNNs for the modeling of refrigeration systems, most authors use LSTM networks. The use of LSTM networks in refrigeration systems is mainly focused on energy analysis of the systems by modeling energy consumption [
73,
81] or COP [
82].
In [
81], a variable refrigerant flow system is modeled using LSTM networks. The study focuses on the prediction of the energy load. The proposed model produces a multi-step-ahead forecast of electric consumption from time-series data using 18 process variables. The authors highlight the importance of data pre-processing for training LSTM networks. In order to increase the performance of the network, instead of increasing the number of neurons and/or hidden layers, they carry out different feature selection tasks, such as Pearson Correlation and Random Forest techniques, to identify relevant and redundant features. In this way, they are able to improve the results of the model.
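A correlation-based feature selection step of the kind mentioned above can be sketched as follows. Only the Pearson part is illustrated, not the Random Forest part. The function names, the thresholds `min_relevance` and `max_redundancy`, the variable names, and the data are all assumptions made for this example and do not come from the reviewed study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sx * sy)


def select_features(features, target, min_relevance=0.3, max_redundancy=0.95):
    """Keep features that correlate with the target, dropping any feature
    that is nearly collinear with one already kept."""
    relevance = {name: abs(pearson(col, target))
                 for name, col in features.items()}
    kept = []
    for name in sorted(features, key=relevance.get, reverse=True):
        if relevance[name] < min_relevance:
            continue  # irrelevant feature
        if any(abs(pearson(features[name], features[k])) > max_redundancy
               for k in kept):
            continue  # redundant with a feature that was already kept
        kept.append(name)
    return kept


# Synthetic data: two nearly identical temperature sensors and one
# on/off flag that is unrelated to the target.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "suction_temp": [2.0, 4.1, 5.9, 8.0, 10.1],
    "suction_temp_backup": [2.1, 4.0, 6.0, 8.1, 10.0],
    "fan_flag": [1.0, 0.0, 1.0, 0.0, 1.0],
}
kept = select_features(features, target)
```

In this example only one of the two temperature sensors survives, because the second adds almost no new information, and the flag is discarded as irrelevant.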
On the other hand, in [
83], the authors compare the performance of an LSTM network with other artificial neural models such as ANNs, DNNs, or 2D-CNNs for chiller efficiency monitoring. Two systems of different sizes are studied, a large water-cooled chiller (WCC) and a small air-cooled chiller (ACC). Both models predict the cooling production and COP. For the WCC system, LSTM and CNN models offer the best performance. In both cases, the error is below 2%. In the ACC system, the LSTM method provides the lowest error, closely followed by the CNN model. In [
53], the performance of different DNN models (MLP, CNN, and LSTM) is also compared to predict the power consumption of the compressor. For each model, the authors use 18,000 time-series data samples and four hidden layers to train the networks. For this case study, the LSTM network had the best performance followed by the CNN.
In [
84], the authors design a model to predict the indoor temperature in a building. The main objective is to forecast the temperature one time step and multiple time steps ahead. For this purpose, different AI models, such as SVM, Decision Trees, ANN, and LSTM, are evaluated. For one-step-ahead forecasting, all models show similar performance; however, the LSTM network outperforms the other models for multi-step forecasting.
Studies such as [
85] predict the temperature and power demand in a cold storage system. Using different environment variables, such as the air temperature of the room, the compressor state (on/off), or the power consumption, the authors predict the temperature of the cold store and the compressor power consumption. For the study, they use different LSTM networks, such as a traditional LSTM, a bidirectional LSTM, and a convolutional LSTM model. The following conclusions are reached. First, the temperature prediction shows lower performance than power demand forecasting. The authors attribute this to the fact that temperature variations depend on more complex features. Second, the highest performance is obtained with the traditional LSTM model. This is because bidirectional and convolutional LSTM models are more sensitive to noise, so they perform better with less noisy data. It was also found that the use of more sensors for training leads to better predictions.
LSTM networks achieve outstanding results when used together with other prediction or optimization models. In [
82], the authors state that the use of an encoder–decoder in conjunction with an LSTM network has improved the COP prediction in a water-cooled chiller. In addition, several studies have shown that the combination of CNNs and LSTM networks improves on the performance of both models used separately. In [
86], the authors have applied a hybrid CNN–LSTM model for cooling load forecasting in a refrigeration system. The CNN learns spatial information in the data, while the LSTM network extracts temporal features and makes predictions. The proposed model improves on the results of other techniques such as SVM, DNNs, RNNs, and LSTM networks.
3.4.2. Fault Diagnosis in Refrigeration Systems Using LSTM Networks
Due to the multi-step time series forecasting capabilities, LSTM networks are being widely used for fault detection in industrial refrigeration systems. This task is carried out by comparing the value predicted by the model with the actual value of a certain system variable, such as temperature or refrigerant charge. The deviation between these two magnitudes can be used as an indicator of certain types of failures in refrigeration systems.
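The comparison between predicted and measured values can be sketched as a simple residual check. The function name `fault_alarms`, the fixed threshold, and the rule that requires several consecutive deviations before raising an alarm are assumptions made for this illustration; the reviewed studies may use different criteria, and the data are synthetic.

```python
def fault_alarms(actual, predicted, threshold, min_consecutive=3):
    """Flag the time steps at which |actual - predicted| has exceeded
    the threshold for at least min_consecutive consecutive samples."""
    alarms, run = [], 0
    for a, p in zip(actual, predicted):
        run = run + 1 if abs(a - p) > threshold else 0
        alarms.append(run >= min_consecutive)
    return alarms


# Hypothetical model predictions and sensor readings (e.g. in degrees C).
predicted = [5.0, 5.1, 5.0, 5.2, 5.1, 5.0, 5.1, 5.0]
actual = [5.1, 5.0, 6.5, 5.1, 6.4, 6.6, 6.8, 6.9]
alarms = fault_alarms(actual, predicted, threshold=1.0)
```

The isolated deviation at the third sample does not trigger an alarm, whereas the sustained deviation at the end of the series does. Requiring persistence is one simple way to limit false alarms caused by sensor noise.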
In [
87], an LSTM model has been used to design a fault detection system for a chiller. The model focuses on predicting the suction temperature and suction pressure of the compressor. For training the network, data from 11 sensors have been used. The dataset, composed of 45,000 samples, includes the two target variables—condenser-air temperature and refrigerant temperature—among others. The authors have evaluated other AI models such as Linear Regression and ANNs; however, the best results are achieved with the LSTM network. Finally, the model is applied to diagnose failures in the refrigeration system. For that, the authors analyze the deviations between the actual sensor values and the model predictions to create fault alarms in the system.
In [
88], the authors use a two-layer LSTM neural network to predict five typical failures in a heating, ventilation, and air-conditioning (HVAC) system. The model is composed of two subsystems. First, the fault detection task is carried out by a binary classifier. If a fault has been identified, the sample is further analyzed in a second phase where the failures are classified. The authors use the ASHRAE Project 1043 dataset. Before training the model, data pre-processing is carried out in order to simplify the initial dataset. This task is performed by the ReliefF algorithm, which identifies the 10 most relevant variables in the dataset. In addition, during the training of the model, different network parameters are tested, such as the number of epochs, the number of hidden layers (from 1 to 4), the number of neurons per layer, and the learning rate. They stress the importance of carrying out feature engineering due to the large number of samples in the dataset, since feature selection reduces the risk of overfitting and improves the convergence speed of the network. Finally, the model is compared with a traditional RNN and a GRU network; the LSTM network achieves the best results.
Studies such as [
89] have proposed improvements in LSTM models for fault detection in refrigeration systems. In this paper, the authors have designed a model called AE-LSTM (Adaptative-Enhanced LSTM). The main feature of the model is the clustering of data by sensor groups in order to improve the predictive capability of the network. After training, the AE-LSTM model provides faster and more effective fault diagnosis than a traditional LSTM network. This is because the AE module can adaptively vary the error of the backpropagation algorithm and reduce the computational time by adapting the learning rate.
New types of variables, such as sound or vibration, are being monitored in refrigeration systems in order to diagnose failures. In [
90], the authors have designed an LSTM model where, through the analysis of the acoustic signal emitted by a reciprocating compressor, it is possible to determine the refrigerant leakage of the valve of the compressor. The study analyzes two DNN models, a CNN and an LSTM network. The LSTM model achieves a slightly higher accuracy (96.14% versus 94.5% for the CNN); however, the convolutional network trained faster than the LSTM network.
4. Discussion and Trend Analysis
A growing body of research has found that the use of neural network models improves the design and control of refrigeration systems. Different techniques have been used in applications such as cooling demand modeling; study and optimization of energy performance of the system; control of refrigerant systems through forecasting and decision-making models; and fault diagnosis or refrigerant characterization. Depending on the problem, some neural network models are more suitable than others. For system modeling, researchers are increasingly turning to the use of DNNs. As can be observed in the review, most of the recently published papers use DNNs for refrigeration system modeling. These models require a larger number of samples for their application. As stated in [
91], the volume of data is often more important than the algorithm used. However, a large amount of data is not always sufficient to apply a model successfully. Data quality is as important as data quantity. The data needs to be representative. Usually, data contains errors, outliers, and noise. These factors make it difficult to train a neural network model and limit its capabilities. On the other hand, deeper networks have a higher risk of overfitting as they may model noise patterns or include non-relevant features into the model. For this reason, feature engineering [
81] or overfitting countermeasures such as Dropout [
92] are commonly used in DNN modeling for refrigeration systems.
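As an illustration of the Dropout countermeasure mentioned above, the following minimal sketch applies the common "inverted dropout" variant to a list of hidden-layer activations. The function name `dropout`, the rate of 0.5, and the activation values are assumptions for this example; deep learning libraries provide their own implementations.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale the survivors so that the expected
    value is unchanged; at inference time the layer does nothing."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
hidden = [0.2, 1.5, 0.7, 0.9, 1.1, 0.4]
train_out = dropout(hidden, rate=0.5, rng=rng)
eval_out = dropout(hidden, rate=0.5, rng=rng, training=False)
```

Because a different random subset of neurons is silenced at every training step, the network cannot rely on any single neuron, which tends to reduce the modeling of noise patterns.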
On the other hand, underfitting problems occur when the model is not able to capture the underlying behavior of the system properly. This issue can be solved either by using more and better-quality training data or by selecting a more powerful model. The use of more specific neural networks, such as CNNs or LSTM models, is growing in refrigeration system applications. CNNs have proven to be one of the best methods for fault diagnosis in refrigeration systems. Although LSTM networks are effective for fault detection, CNNs are more appropriate for these tasks because, in addition to detecting a fault, they can diagnose the type of failure in the system.
As mentioned above, one of the major difficulties of DNN models is the need for large amounts of data for training. As discussed, studies such as [
85,
93] use more than 100,000 samples for training an LSTM network while studies such as [
94] use more than 200,000 samples and hundreds of neurons to design a fault diagnosis model for an HVAC system. Datasets with hundreds of thousands of samples for a given refrigeration system are often difficult to obtain. Furthermore, the number of samples for each target variable is not always balanced. In classification problems, a sufficient amount of data from each category is needed. This factor is particularly important in fault diagnosis problems due to the difficulty of obtaining a good number of examples of system failures. Due to the lack of such data, many researchers have been forced to use public databases such as ASHRAE to train and evaluate their fault diagnosis models [
72]. Other studies, such as [
60], have chosen to simulate system failure episodes to train their models.
Due to the increase in sensorization and IoT devices in industry, the quantity and quality of data available on refrigeration systems will increase significantly in the coming years. This phenomenon, in addition to improving the quality of new models, will lead to a greater diversity of studies on a wider range of refrigeration systems. Furthermore, new data filtering and feature engineering techniques will be increasingly used to improve the efficiency and accuracy of the refrigeration system models.
On the other hand, hybrid modeling techniques will be explored to improve the control systems and reduce the energy consumption of industrial refrigeration systems. As observed in the review, the combination of different AI techniques improves the performance of the models, allowing more complex problems to be solved. The training time and computational cost of the models will be two essential parameters in the design phase of any model. Reducing the complexity of the models will be a key factor for their integration into real-time control systems.
Another of the main research directions identified is the use of new attributes for the modeling of refrigeration systems. Although researchers have started to use some alternative variables such as vibrations or acoustic signals, new variables will be explored in order to improve the accuracy of the models and the efficiency of the refrigeration systems. Candidates include weather forecasts, energy tariffs, and events in cold rooms such as door openings, product handling, or operation schedules.
5. Conclusions
This article presents a detailed evaluation of the state-of-the-art of neural network modeling for increasing the energy efficiency of refrigeration systems. The literature review shows how neural network algorithms can be used to predict, optimize, control, and diagnose the behavior of industrial refrigeration systems.
The refrigeration industry is taking advantage of the effective data-based modeling techniques offered by ANNs. These technologies are assisting experts in the refrigeration industry in the design and control of more efficient systems. The optimization of energy consumption has been the main focus of study in the sector, where neural models have achieved significant energy savings compared to traditional control methods. The best modeling results have been obtained with DNNs, which have the ability to extract more complex features from the system datasets.
More specific neural network configurations have proven effective for solving certain refrigeration system problems. This is the case of CNNs, which show the best performance for designing fault detection and diagnosis systems, and of LSTM networks, which are suitable for modeling time series and predicting the future behavior of refrigeration systems.
The wide-ranging possibilities offered by neural networks and their outstanding results are making them increasingly popular in the refrigeration industry. Even so, it is clear that data-driven methods are not a substitute for physics-based models. Understanding the fundamental physics of the behavior of refrigeration systems and their thermodynamic properties is essential for the proper application of data-driven models to the prediction of energy performance. For this transformation, AI systems engineers, software developers, and refrigeration system experts will need to collaborate to design and build smart control systems for saving energy in refrigeration systems.