1. Introduction
In Canada, there are over 5000 large supermarkets, and each consumes about 5000 megawatt hours (MWh) of electricity annually. Cumulatively, their energy consumption approximates 25 terawatt hours (TWh) per year, which equates to the annual output of three large power plants [1,2]. This massive energy usage underscores the position of supermarkets as among the most energy-demanding types of commercial buildings [3].
On top of meeting standard heating, ventilation, and air conditioning (HVAC) requirements, refrigeration systems in supermarkets are responsible for approximately half of their total energy consumption [3]. This level of usage translates to an annual energy cost of around $150,000 for refrigeration in a large supermarket. In terms of scale, these energy costs equate to nearly 1% of total supermarket sales. This is a significant expenditure, particularly considering that the average supermarket net profit margin is also around 1%. Under standard operating conditions, a 10% decrease in energy costs could therefore lead to an approximate 10% rise in profits [1]. Consequently, optimizing the refrigeration system has a significant positive impact on both the environment and the profitability of these businesses.
In 2018, Behfar et al. [4] comprehensively analysed common operating faults and equipment characteristics in supermarket refrigeration systems. Utilizing expert surveys, facility management system messages, service calls, and service records, their study revealed that refrigerant charge problems (leaks and overcharging) are the most common cause of equipment failure in supermarkets. Typically, a supermarket refrigeration system houses between 3000 and 5000 lb of refrigerant in a closed circuit [4,5]. Annual leak rates average around 11%, spiking up to 30% in certain situations [6]. Leaks in refrigeration systems can originate from four distinct sources. The first is gradual leakage through components over time, which often remains undetected until the leak becomes substantial. The second is catastrophic or physical damage, resulting in significant refrigerant losses within a short period. A third source is the activation of pressure relief devices. Lastly, small but consistent losses may occur during routine maintenance, repair, and refrigerant recovery, contributing to the overall leakage [6]. These leaks lead to significant direct and hidden costs for supermarkets, including lost sales from perishable items, increased refrigerant expenses, compliance fines, higher energy consumption, food spoilage, and impacts on employee satisfaction and customer loyalty. The economic impact of such outages can therefore be much higher than the direct maintenance costs [7]. Beyond these economic losses, refrigerant leaks also have a direct impact on the environment. While nearly all newly constructed supermarket stores have moved away from using hydrofluorocarbons (HFCs), these remain the predominant refrigerants in existing supermarkets [6]. HFCs pose a substantial environmental challenge, given their atmospheric lifespan, which can extend up to 14 years. Regulatory authorities worldwide have implemented stricter regulations regarding leak repair and store inspections to address this urgent issue, and these new regulations encourage advanced monitoring techniques [8,9,10]. Considering the regulations for environmental protection and food safety, along with the costs associated with refrigerant loss, the early detection of leaks in supermarket refrigeration systems is critically important. As a result, research into refrigerant leakage has gained substantial attention.
With the advent of Industry 4.0 [11], fault detection and diagnosis (FDD) tasks, which were typically performed by humans, are being transformed into automatic fault detection and diagnosis (AFDD) through the use of data-driven methods. Although many AFDD methods have been developed for air conditioning systems and chillers [12,13,14], comparatively little attention has been paid to commercial supermarket refrigeration systems. While the basic refrigeration cycles are similar, supermarket refrigeration systems and chillers differ in many respects. AFDD methods for industrial systems typically function by monitoring specific performance indices or features that are sensitive to a particular fault. When a performance index significantly deviates from its expected value or exceeds a predetermined threshold, the AFDD method detects the presence of a fault [15,16,17,18,19,20,21,22,23]. In other cases, AFDD is addressed as a classification problem [24]. However, due to the limited availability of faulty data in real-world scenarios, a significant class imbalance arises when the classification approach is applied to actual data.
Utilizing power consumption as a sensitive feature is a prevalent approach in AFDD solutions for detecting leaks in supermarket refrigeration systems. Fisera and Stluka [22] proposed a regression-based anomaly detection model that fused power consumption data with additional sensor data. After being trained on non-faulty data, the model produced large prediction errors when it encountered faulty data, owing to the disruption of the relationship between the input variables. Mavromatidis et al. [21] leveraged an artificial neural network (ANN) to predict energy consumption and used thresholds to detect faults. Srinivasan et al. [16] presented an approach to detect anomalous behaviours, including refrigerant leaks, by monitoring energy signals alone, employing a seasonal autoregressive integrated moving average (SARIMA) model and a regression model. They further incorporated additional sensors to reduce false positives and used the refrigerant level at the receiver for leak detection.
Aside from power consumption, a few studies have focused on parameters that reflect thermodynamic behaviour changes in a system, including variations in temperature, pressure, mass flow rate, and heat transfer rates. Yang et al. [25] employed Kalman Filter and Extended Kalman Filter techniques to generate residuals, which were then analysed to detect and isolate four types of sensor faults (drift, offset, freeze, and hard-over) for two temperature sensors. Detection was performed by comparing the residual to a predetermined threshold, and a cumulative sum (CUSUM) method was used for residual evaluation to enhance the clarity of the residual deviation. In a subsequent study, Yang et al. [26] expanded on fault detection and isolation in supermarket refrigeration systems by introducing the Unknown Input Observer method, which is capable of detecting both sensor and parametric faults. While these studies did not explicitly focus on refrigerant leak detection, their approach to sensor fault detection and isolation could potentially be applied to leak detection, depending on the specific characteristics of the leak and its impact on the sensors.
Assawamartbunlue and Brandemuehl [17] used the Refrigerant Leak Index (RLI) parameter in their study, developing a probabilistic approach to detect refrigerant leakage in supermarket refrigeration systems. The authors created a belief network model that predicts the liquid volume fraction in the receiver, utilizing seven variables derived from measured data and the average temperatures of refrigeration cases. The RLI values, calculated from the difference between the predicted and observed liquid volume fractions, served as the target variable for leak detection. Moreover, they constructed an ANN model to make the technique faster and more practical for field implementation on small microcomputers. Their study provides a unique perspective on using probabilistic models and neural networks for leak detection in supermarket refrigeration systems. In a different study, Wichman and Braun [20] presented a decoupling-based diagnostic method for detecting faults, including refrigerant leaks, in commercial coolers and freezers. This method uses decoupling features, i.e., parameters uniquely influenced by individual faults, to manage multiple simultaneous faults; the difference between the superheat and subcooling temperatures served as the decoupling feature for undercharge faults. Their study demonstrated the potential of this method for real-world applications, but also emphasized the need for further research to improve the decoupling of certain faults, such as refrigerant charge problems, from other faults.
Behfar et al. [18] examined the potential of AFDD methods in supermarket systems, evaluating two distinct methods: a rule-based method and a data-driven method. The rule-based method, originally developed for HVAC systems [27], exhibited sensitivity to certain faults, such as a broken condenser fan and the first lighting fault scenario, but was less sensitive to undercharge fault scenarios. The data-driven method [21] effectively detected changes in energy consumption, but was less effective when input variables fluctuated significantly during normal operation. The authors suggested that future research could improve fault detection in supermarket refrigeration systems by combining data-driven methods with more detailed input data, such as pressure and temperature measurements and control modes. They also proposed weighting the diagnostic inputs to enhance the effectiveness of the AFDD methods. Their work provides valuable insights for improving AFDD methods for detecting leaks and other faults in supermarket refrigeration systems.
This paper introduces a novel leak detection framework that is explicitly tailored for supermarket refrigeration systems. The main contributions of this paper can be summarized as follows:
A robust leak-detection framework designed for supermarket refrigeration systems: None of the existing solutions detect both slow and catastrophic leaks, which show two contrasting behaviours and should be treated differently.
A framework that is independent of the refrigerant level sensor of the receiver tank: Most existing solutions rely on the refrigerant level sensor of the receiver tank, which is unavailable in most supermarkets. Instead, the proposed solution relies only on the thermodynamic properties and energy data available in any supermarket refrigeration system.
Use of a false alarm mitigation mechanism: None of the existing solutions for leak detection in supermarket refrigeration systems have implemented a false alarm mitigation mechanism to improve accuracy, even during the transition of the system’s control modes.
The remainder of the paper is organized as follows. Section 2 gives brief background information on the proposed approach, and Section 3 describes the methodology followed for catastrophic and slow leak detection. Section 4 presents the results of the proposed algorithms using real-world data obtained from supermarkets in Canada. Finally, the conclusions are presented in Section 5.
3. Methodology
Modern supermarket refrigeration systems possess a plethora of probes that collect data covering the entire refrigeration process, from the compressor, condenser, and expansion valve to the display cases. This study focuses primarily on the sensor values that are common to most supermarket refrigeration systems. Most operational data in a typical supermarket refrigeration system are highly susceptible to external ambient temperature and humidity. These systems also function in various modes, leading to fluctuations in operational data as the system transitions between conditions; for instance, there are regular defrosting cycles, changes in the state of the heat reclaim valve, and alternating split states in the condenser. Relying on raw sensor data for anomaly detection can therefore result in many false positives, as these operational mode changes can mimic or mask genuine anomalies [20]. Consequently, a prediction-based anomaly detection model incorporating these external factors has been developed to accurately identify true anomalies. The CatBoost regressor is chosen for time series prediction due to its promising capabilities for short-term predictions, its robustness to outliers, its ability to capture non-linear relationships, and its reduced risk of overfitting. Moreover, the CatBoost regressor demonstrates better performance on this data than similar models such as XGBoost and LightGBM. Two major types of leaks occur in supermarket refrigeration systems: catastrophic leaks and slow leaks. Separate algorithms have been developed to detect each type under the umbrella of a single framework.
3.1. Data Preprocessing and Feature Engineering
The data for this study are collected from various supermarkets in Canada that use HFC refrigerants. For confidentiality reasons, the names of the supermarkets cannot be disclosed; code names such as Supermarket 1 and Supermarket 2 are therefore used in this paper. Past leak events are identified through a comprehensive approach that combines data analysis, expert insights, service records, and customer complaints, effectively covering both slow and catastrophic leaks. Datasets of 3–6 months are then carefully extracted from different supermarkets, selected so that normal non-leak operational data are present before and after each leak event.
Typically, companies maintaining supermarket refrigeration systems manually analyse vast data streams from these systems to detect leaks. With domain expertise and references to the refrigeration literature [23,37,38], critical parameters for the proposed model are identified and listed in Table 1. However, the industrial dataset presented practical challenges. For instance, some stores have probes to gather specific parameters, while others do not. Additionally, some parameters have numerous missing values and noise, rendering them almost unusable. Using the available data and the thermodynamic library CoolProp [39], a set of new features is created to gain insights into the thermodynamic behaviour of the system, as shown in Table 2.
The refrigerant level value at the receiver has not been used for any modelling, since it is not available in all supermarkets. Additionally, feature engineering techniques are employed to extract time series-specific features. These include lagged values, cyclic encoding of the hour of the day for daily seasonality, and cyclic encoding of the day of the week for weekly seasonality.
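The cyclic encodings mentioned above can be sketched as follows. This is a minimal illustration in plain Python; the function name is ours and not part of the paper's pipeline:

```python
import math

def cyclic_encode(value, period):
    """Map a cyclic quantity (e.g., hour of day) onto the unit circle,
    so that period boundaries (23:00 -> 00:00) remain adjacent in
    feature space, unlike a raw integer encoding."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

# Hour-of-day (period 24) and day-of-week (period 7) features
hour_sin, hour_cos = cyclic_encode(23, 24)
dow_sin, dow_cos = cyclic_encode(6, 7)
```

The sin/cos pair is needed because a single cyclic value cannot be mapped to one continuous feature without a discontinuity at the period boundary.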
For each dataset, a partial correlation analysis is performed for each target parameter to select appropriate input features and mitigate the effect of multicollinearity. Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots are analysed to determine the optimal number of lagged features required for the model. Figure 2a,b depict the ACF and PACF plots of an input feature, and Figure 2c,d show the ACF and PACF plots of a target feature, respectively. A significant autocorrelation is also observable at the 96th lag, corresponding to the same time on the previous day. Moreover, L2 regularization is used to prevent overfitting.
For a given dataset, the leak events, service incidents, and other special events (e.g., power outages) are identified first. The dataset is then divided into training, validation, and test sets. The test set is selected to include a leak event: one month of data is used to detect catastrophic leaks, while two months of data are selected for slow leak detection. The training and validation sets are selected to include only normal (non-leak) operation data without outliers. Depending on the availability of data, two to three months of data are used for the training set, while fifteen to thirty days of data are used for the validation set. A gap is maintained between each set to prevent data leakage caused by the use of lagged values (Figure 3).
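The gapped chronological split can be sketched as below. The split fractions and the gap width (96 samples, i.e., one day of 15 min data) are illustrative assumptions, not the paper's exact values:

```python
def gapped_split(n_samples, train_frac=0.6, val_frac=0.15, gap=96):
    """Chronological train/validation/test split with a gap of `gap`
    samples between sets, so lagged features built inside one set can
    never overlap observations from the next set (avoiding leakage)."""
    train_end = int(n_samples * train_frac)
    val_start = train_end + gap
    val_end = val_start + int(n_samples * val_frac)
    test_start = val_end + gap
    train = range(0, train_end)
    val = range(val_start, val_end)
    test = range(test_start, n_samples)
    return train, val, test
```

The gap must be at least as long as the largest lag used in feature engineering; otherwise a lagged feature in the validation or test set would carry information from the preceding set.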
Earlier in-depth analyses by Tassou and Grace [23,37] and Cho et al. [40] investigated how various operational characteristics are affected by changes in the refrigerant level, particularly in chillers and transcritical CO2 heat pumps equipped with a flash tank. Their research indicated that the coefficient of performance (COP), superheat temperature, subcooling temperature, mass flow rate, compression ratio, and power consumption respond to changes in the refrigerant level. Furthermore, Fisera and Stluka [22] emphasised the potential use of parameters like COP for system-level fault detection in supermarket refrigeration systems. Therefore, in the proposed solution, COP, superheat temperature, subcooling temperature, mass flow rate, compression ratio, and power consumption are selected as the target features when developing the prediction models.
Figure 4 shows the overall architecture of the proposed algorithm.
3.2. Catastrophic Leak Detection
Catastrophic leaks involve the rapid release of refrigerant from a refrigeration system, often due to equipment failure. Since the system loses a significant amount of refrigerant within a short period of time, a catastrophic leak causes rapid fluctuations in system parameters. In modern supermarket systems, catastrophic leaks are rare, largely due to consistent and proactive maintenance; however, given the massive scale of these systems, they remain possible. This paper proposes a prediction-based anomaly detection algorithm to detect catastrophic leaks using 15 min sampled data. A CatBoost regression model is used for prediction, and a non-parametric dynamic thresholding method is used for anomaly detection.
A CatBoost regression model is trained using non-leak data to accurately predict the target variable $y$ during normal operation. Lagged values of operational data (including temperature, pressure, energy, and calculated thermodynamic data), feature-engineered time series data, system status changes, and both outdoor and indoor humidity and temperature values are compiled into a tabular format, which serves as the input for the CatBoost regression model. Prediction error values are obtained in real time and fed into the anomaly detection algorithm. Due to the complexity of the system, these prediction errors always contain noise, and the anomaly detection algorithm must be capable of distinguishing anomalies from this noise. A non-parametric dynamic thresholding algorithm, proposed by Hundman et al. [41], is used to detect these anomalies. The algorithm comprises the following steps: (1) dynamic error thresholding and (2) false positive mitigation.
3.2.1. Dynamic Error Thresholding
At time step $t$, a prediction error value is evaluated as $e^{(t)} = |y^{(t)} - \hat{y}^{(t)}|$, and the error series is defined as $\mathbf{e} = [e^{(t-h)}, \ldots, e^{(t-1)}, e^{(t)}]$, where $h$ symbolizes the length of the window of past error values applied to assess current errors.
Setting an appropriate threshold for the error value series is necessary to reliably identify anomalies. Initially, an exponentially weighted average technique is used to generate the smoothed error series $\mathbf{e}_s = [e_s^{(t-h)}, \ldots, e_s^{(t-1)}, e_s^{(t)}]$, which removes sharp spikes. Then, a set of potential threshold values $\boldsymbol{\varepsilon}$ is generated, as follows in Equation (1):

$\boldsymbol{\varepsilon} = \{\mu(\mathbf{e}_s) + z\,\sigma(\mathbf{e}_s) \mid z \in \mathbf{z}\} \quad (1)$

Each possible threshold $\varepsilon \in \boldsymbol{\varepsilon}$ is determined as shown in Equation (2):

$\varepsilon = \mu(\mathbf{e}_s) + z\,\sigma(\mathbf{e}_s) \quad (2)$

here, $z$ represents the tolerance level of the threshold in terms of the number of standard deviations away from $\mu(\mathbf{e}_s)$, generally ranging from 2 to 10. The optimal threshold $\varepsilon^{*}$ is given by Equation (3):

$\varepsilon^{*} = \underset{\varepsilon}{\operatorname{argmax}}\;\frac{\Delta\mu(\mathbf{e}_s)/\mu(\mathbf{e}_s) + \Delta\sigma(\mathbf{e}_s)/\sigma(\mathbf{e}_s)}{|\mathbf{e}_a| + |E_{seq}|^{2}} \quad (3)$

where:

$\Delta\mu(\mathbf{e}_s) = \mu(\mathbf{e}_s) - \mu(\{e_s \in \mathbf{e}_s \mid e_s < \varepsilon\})$
$\Delta\sigma(\mathbf{e}_s) = \sigma(\mathbf{e}_s) - \sigma(\{e_s \in \mathbf{e}_s \mid e_s < \varepsilon\})$
$\mathbf{e}_a = \{e_s \in \mathbf{e}_s \mid e_s > \varepsilon\}$
$E_{seq} = \text{continuous sequences of anomalous errors } e_a \in \mathbf{e}_a$

Once $\varepsilon^{*}$ is identified, it allows for the most effective separation between normal and abnormal data, while curtailing any excessive bias towards the abnormal data and preventing overly eager behaviour. Finally, an anomaly score is assigned to each anomalous sequence $\mathbf{e}_{seq}$ in $\mathbf{e}_a$ above $\varepsilon^{*}$ to indicate the severity of the anomaly. The anomaly score ($s$) is calculated as in Equation (4):

$s = \frac{\max(\mathbf{e}_{seq}) - \varepsilon^{*}}{\mu(\mathbf{e}_s) + \sigma(\mathbf{e}_s)} \quad (4)$
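The dynamic error thresholding of Hundman et al. [41] described above can be sketched in plain Python. This is a simplified illustration: the EWMA smoothing and sequence grouping are condensed, and all names are ours:

```python
from statistics import mean, pstdev

def ewma(errors, alpha=0.3):
    """Exponentially weighted moving average used to smooth raw errors."""
    smoothed, s = [], errors[0]
    for e in errors:
        s = alpha * e + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

def anomalous_sequences(flags):
    """Group consecutive True flags into (start, end) index pairs."""
    seqs, start = [], None
    for i, f in enumerate(flags):
        if f and start is None:
            start = i
        elif not f and start is not None:
            seqs.append((start, i - 1))
            start = None
    if start is not None:
        seqs.append((start, len(flags) - 1))
    return seqs

def best_threshold(e_s, z_values=range(2, 11)):
    """Pick epsilon = mu + z*sigma that maximizes the normalized drop
    in mean/std once values above epsilon are removed, penalized by the
    count of anomalous points and sequences (the Equation (3) criterion)."""
    mu, sigma = mean(e_s), pstdev(e_s)
    best, best_score = None, float("-inf")
    for z in z_values:
        eps = mu + z * sigma
        below = [e for e in e_s if e < eps]
        above = [e for e in e_s if e >= eps]
        if not above or not below:
            continue  # threshold flags everything or nothing; skip
        seqs = anomalous_sequences([e >= eps for e in e_s])
        score = ((mu - mean(below)) / mu + (sigma - pstdev(below)) / sigma) \
                / (len(above) + len(seqs) ** 2)
        if score > best_score:
            best, best_score = eps, score
    return best
```

The penalty term `len(above) + len(seqs) ** 2` discourages thresholds that flag many points or many scattered sequences, which curbs the "overly eager" behaviour mentioned above.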
3.2.2. Mitigating False Positives
Since the prediction error is noisy, there is a risk of many false positives. To deal with this issue, an extra vector $\mathbf{e}_{max}$ is introduced, comprising the highest smoothed error value from each anomalous sequence $\mathbf{e}_{seq}$, organized in descending order (Equation (5)):

$\mathbf{e}_{max} = [e_{max}^{(1)}, e_{max}^{(2)}, \ldots], \quad e_{max}^{(i)} \geq e_{max}^{(i+1)} \quad (5)$

The maximum smoothed error that is not classified as an anomaly is also appended to the end of this vector. When incrementally traversing through the sequence, for each step $i$ within $\mathbf{e}_{max}$, $d^{(i)}$ is calculated using Equation (6), which denotes the percentage reduction between subsequent errors in $\mathbf{e}_{max}$:

$d^{(i)} = \frac{e_{max}^{(i-1)} - e_{max}^{(i)}}{e_{max}^{(i-1)}} \quad (6)$

Another threshold, $p$, is designated as the expected minimum percentage decrease. If at any step $i$, $p$ is surpassed by $d^{(i)}$, then all the errors $e_{max}^{(j)}$ where $j < i$ and their corresponding anomaly sequences are affirmed as anomalies. Conversely, if $d^{(i)} < p$ and the same condition applies for all subsequent steps, the associated error sequences are re-categorized as normal. The accurate selection of the threshold $p$ ensures a clear differentiation between errors arising from regular noise within the stream and genuine anomalies that have occurred in the system.
3.3. Slow Leak Detection
Slow leaks are both the most common and the most challenging type of leak to detect. Such leaks do not significantly affect the operational characteristics or the efficiency of the system until a substantial amount of refrigerant has leaked out. Consequently, detecting slow leaks requires a different approach. The proposed approach combines empirical observations with predictive modelling. There is an observable shift in the partial correlation coefficient between the input variables ($X$) and the target variables ($Y$) as the system transitions from a non-leak state to a leaked state. The partial correlation coefficient is calculated while controlling for confounding variables, such as the indoor and outdoor temperature and humidity.
The CatBoost regression model is used to construct the predictive model $f$, utilizing the 1 h sampled data, thus establishing the relationship $Y = f(X) + \epsilon$ during normal operation, where $\epsilon$ denotes the residual error. As the system enters a slow leak state, changes in the correlations between $X$ and $Y$ result in the prediction error $e(t)$ given in Equation (7):

$e(t) = Y(t) - f(X(t)) \quad (7)$

The squared error $e(t)^{2}$ is accumulated over time, forming the cumulative sum of squared errors ($CSSE$). The bi-weekly rise ($\Delta CSSE$) of the $CSSE$ is computed from the hourly data, as given in Equation (8), where $t$ represents the current hour and $t - 336$ represents the hour exactly two weeks prior:

$\Delta CSSE(t) = CSSE(t) - CSSE(t - 336) \quad (8)$
The gradient $g(t)$ of the $CSSE$ and the rolling mean $\mu_g$ and standard deviation $\sigma_g$ of the gradient are calculated over a two-week window. Rapid increases in $CSSE$ values are identified when $g(t) > \mu_g + k\,\sigma_g$, for a chosen tolerance multiplier $k$. The amount of each rapid increase and the bi-weekly sum of rapid increases ($S_{rapid}$) are computed. The bi-weekly rise due to gradual increases ($\Delta CSSE_{gradual}$) is then calculated, as shown in Equation (9):

$\Delta CSSE_{gradual}(t) = \Delta CSSE(t) - S_{rapid}(t) \quad (9)$
Finally, a new time series ($D$) is generated by resampling the data to daily intervals, taking the final value of $\Delta CSSE_{gradual}$ in each day. This focuses on the longer-term, more gradual changes in the cumulative squared errors. The proposed detection strategy sets a threshold ($\tau$) for $D$. Under normal operation, $D$ remains consistent without any trend, but with the onset of a slow leak, $D$ tends to increase. The slow leak detection criterion is thus: if $D(t) > \tau$, a slow leak is likely to be present. The selection of $\tau$ involves a trade-off between minimizing false positives and ensuring the early detection of slow leaks.
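The slow-leak statistic described above can be sketched as follows. The rapid-increase multiplier k is an illustrative assumption; note that with hourly data the gradient of the CSSE at each step reduces to the squared error itself:

```python
from itertools import accumulate
from statistics import mean, pstdev

HOURS_TWO_WEEKS = 336  # hours in two weeks of hourly samples

def biweekly_gradual_rise(squared_errors, k=3.0):
    """Bi-weekly rise of the cumulative sum of squared errors with
    rapid jumps removed (sketch of Equations (8) and (9)). Because the
    CSSE gradient over one hourly step equals the squared error, the
    rapid-increase test is applied directly to the squared errors.
    The multiplier k is illustrative, not taken from the paper."""
    csse = list(accumulate(squared_errors))
    gradual = []
    for t in range(HOURS_TWO_WEEKS, len(csse)):
        delta = csse[t] - csse[t - HOURS_TWO_WEEKS]        # bi-weekly rise
        window = squared_errors[t - HOURS_TWO_WEEKS + 1:t + 1]
        mu, sigma = mean(window), pstdev(window)
        rapid = sum(g for g in window if g > mu + k * sigma)
        gradual.append(delta - rapid)                      # gradual portion
    return gradual
```

Removing the rapid jumps keeps one-off events (e.g., a transient fault or data glitch) from dominating the slow, monotone drift that signals a slow leak.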
3.4. Model Evaluation
3.4.1. Accuracy of Prediction Model during Non-Leak Operations
During the non-leak operation, the prediction model should be able to predict the modelled target parameter with a minimum prediction error. The mean absolute percentage error (MAPE) and the $R^2$ score are used to measure the performance of the prediction model (Equations (10) and (11)):

$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (10)$

$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \quad (11)$

where:
$n$ is the total number of data points;
$y_i$ is the actual value for the $i$-th observation;
$\hat{y}_i$ is the predicted value for the $i$-th observation.
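Equations (10) and (11) translate directly into code; a minimal plain-Python sketch:

```python
def mape(actual, predicted):
    """Mean absolute percentage error (Equation (10)), in percent.
    Assumes no actual value is zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

def r2_score(actual, predicted):
    """Coefficient of determination (Equation (11))."""
    y_bar = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```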
Furthermore, keeping the false alarm rate (FAR) to a minimum for the non-leak data is essential. This is represented by Equation (12), where FP stands for false positives and TN represents true negatives:

$\mathrm{FAR} = \frac{FP}{FP + TN} \quad (12)$
3.4.2. Accuracy of the Anomaly Detection Model
To measure the performance and utility of the proposed anomaly detection model, the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are first calculated, providing fundamental insight into the model's ability. Then, the precision, recall, and F1 scores are calculated (Equations (13)–(15)):

$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (13)$

$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (14)$

$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (15)$

Lastly, the time-to-detect metric is crucial in assessing the model's timeliness and efficiency in recognizing leaks, a factor of paramount importance in preventing substantial damage or loss.
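The evaluation metrics above reduce to a few lines of code; a minimal sketch computing FAR, precision, recall, and F1 from the four counts:

```python
def classification_metrics(tp, tn, fp, fn):
    """False alarm rate, precision, recall, and F1 from the confusion
    counts. Assumes each denominator is non-zero."""
    far = fp / (fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return far, precision, recall, f1
```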
In the context of refrigeration leak detection, a leak often appears as an anomalous segment. When evaluating similar time-series anomaly detection models, point adjustment protocols [42] are used, because the detection of the anomalous segment is the ultimate objective. For refrigeration leak detection specifically, the time of the leak event plays a crucial role, so any delayed discoveries must be penalized when calculating precision and recall. Therefore, a modified point adjustment protocol is used in this study: all points within the anomalous region after the first successfully detected anomalous point are marked as successfully detected, regardless of whether they were individually captured by the model. An example of this protocol is shown in Figure 5, where 0.5 is used as the threshold. The first row represents the ground truth with 12 points, and two anomalous segments are highlighted within the shaded squares. The second row displays the smoothed prediction error values. The third row presents the point-wise detector results with the specified threshold. Finally, the fourth row showcases the detector results after adjustment.
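The modified point adjustment protocol can be sketched as follows; this is our own minimal implementation of the rule described above:

```python
def adjust_predictions(ground_truth, predictions):
    """Modified point adjustment: within each true anomalous segment,
    every point from the first correctly detected point onward is
    marked as detected. Points before the first detection stay missed,
    so delayed detections are still penalized; false positives outside
    true segments are left untouched."""
    adjusted = list(predictions)
    i, n = 0, len(ground_truth)
    while i < n:
        if ground_truth[i]:  # start of an anomalous segment
            j = i
            while j < n and ground_truth[j]:
                j += 1  # j is one past the end of the segment
            for k in range(i, j):
                if predictions[k]:  # first detection inside [i, j)
                    adjusted[k:j] = [True] * (j - k)
                    break
            i = j
        else:
            i += 1
    return adjusted
```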