Processes
  • Article
  • Open Access

8 November 2025

Data-Driven Detection and Prediction of Refrigeration Equipment Failures Using Rough Sets Theory and the Internet of Things

1
Department of Automatic Control and Computer Engineering, Faculty of Electrical and Computer Engineering, Cracow University of Technology, 31-155 Kraków, Poland
2
Efento Sp. z.o.o., 30-701 Kraków, Poland
*
Author to whom correspondence should be addressed.
Processes 2025, 13(11), 3618; https://doi.org/10.3390/pr13113618
(registering DOI)
This article belongs to the Section AI-Enabled Process Engineering

Abstract

This article presents a system for the detection and prediction of faults in refrigeration equipment, developed using rough set theory, a method from artificial intelligence, and leveraging Internet of Things (IoT) technology for continuous data collection. The system targets the most frequent failures (fan, compressor, and controller faults), allowing early detection and timely intervention. Measurement data are transmitted to a cloud platform for analysis within a distributed architecture, ensuring scalable and efficient processing. Data-driven diagnostic models were built on rough set theory, enabling decision-making based on incomplete or imprecise data. Experiments conducted on both real and simulated datasets demonstrated high detection effectiveness, with accuracy ranging from 76% to 90% across all monitored fault types. Diagnostic parameters were analyzed to assess the system performance comprehensively. The paper also discusses potential directions for further development, including adaptation to other refrigeration devices and integration of the decision-making system into IoT devices, opening the way for fully predictive maintenance solutions.

1. Introduction

Diagnostics of technical appliances constitutes an integral part of their development and operation. It enables the detection and analysis of emerging faults and serves both to improve the product and, above all, to maintain the functionality of the equipment, plan inspections, and schedule necessary repairs. A model-based approach to fault detection and diagnosis in engineering systems, as well as practical information on preventing product deterioration, performance degradation, and machinery damage, was presented in []. In the most advanced systems, fault detection and failure prevention are possible through proactive maintenance based on a predictive maintenance strategy []. Recent achievements in the area of data-driven process monitoring, fault diagnostics, and forecasting can be found in up-to-date surveys [,].
In [,], knowledge-based models, model-based models, and data-driven models used in predictive maintenance algorithms are described. Hybrid models combine the characteristics of two or more models. Artificial intelligence methods belong to data-driven models. According to [], data-driven forecasting accounts for 28.9% of the reviewed literature.
The rapid development of Internet of Things (IoT) solutions has enabled remote monitoring of both individual devices and entire processes, often resulting in greater efficiency and better control over critical operational parameters of devices or processes. However, the huge amount of data generated by IoT devices presents challenges for engineers and researchers related to effective processing, analysis, and utilization of this information for decision-making []. Analytical systems based on artificial intelligence (AI) provide a response to these needs, enabling anomaly detection and the implementation of the aforementioned predictive maintenance strategy. Such solutions not only reduce costs associated with downtime caused by failures but also minimize possible losses that may result from them [,,]. In studies [,,,], numerous artificial intelligence (AI) and machine learning (ML) methods used in industrial processes and equipment were compared.
Heating, ventilation, air conditioning (HVAC), and cooling systems are currently the focus of specialists conducting advanced research in this field []. Several publications have presented the results of applying AI to such systems [,]. Commercial solutions for predictive maintenance and diagnostics of HVAC equipment based on IoT and cloud applications are offered to clients by CoolAutomation, Petah Tikva, Israel. However, the company's website provides neither quantitative data for comparison nor the methodology used [].
Table 1 compares four methods, neural networks (ANN), fuzzy logic (FL), decision trees (DT), and rough sets (RST), in terms of the features and limitations that are essential for our research on the detection and prediction of refrigeration equipment failures.
Table 1. Comparison of selected AI methods features.
When artificial neural networks (ANN) [] and rough sets (RST) [] are compared, each of them has its unique advantages and limitations, depending on the type of data, the purpose of the analysis, and the interpretative requirements. In searching for a method that best meets our needs, the following factors were considered: lack of prior knowledge about the data (e.g., statistical distributions), reduced number of attributes (e.g., sensors), low computational requirements, ability to handle uncertain and incomplete data, ability to generate easily interpretable decision rules, and system transparency (e.g., simplicity). Rough sets provide most of these expected features. The necessary cost of the RST method is associated with the need to discretize continuous data. The high generalization capability of ANN, i.e., the ability to perform well with new, unknown data and to easily adapt to different types of data and problems, is not essential in our task. Additionally, the high computational complexity and lack of transparency of the system (black box) are obvious disadvantages of neural networks for users.
Original research related to artificial neural networks (ANN) can be found in []. Recursive ANNs of varying complexity were used for three generalized classes of device states: no fault, light fault, and severe fault, with reference to boiler operation. Three quantitative measures (accuracy, recall, and F1) were given separately for combinations of all models and device states. Unfortunately, the results for both classes of failure states are unacceptable. The authors also used three aggregated measures that combine results for all three states together: aggregated accuracy, macro F1, and MCC (Matthews Correlation Coefficient), which makes comparison with other studies impossible.
When other machine learning models such as Support Vector Machines (SVM) or Random Forest (RF) are compared against RST, they are usually competitive in terms of performance on large data. They offer lower interpretability but higher generalization ability, and they do not need discretization.
In particular, the RF method is more complex and requires significantly more resources than the RST method. The RF method is not intended for use on already discretized data. It requires a separate ablation analysis step using another method. This random model does not generate decision rules. It does not guarantee that the first solution found will be sufficiently good.
In a recent article [], the use of the RF method is closely related to our research topic and provides some quantitative data to compare both methods. This method is intended for remote predictive maintenance of supermarket refrigeration systems, which fall under the class called Refrigeration and Cold-Storage Systems (RCSS). The system was trained on data from 680 refrigeration cases and tested on data from 1585 devices. The authors state that the system uses only temperature data from the refrigeration systems, which are prone to many disturbances and noise. It is not clear whether the system can handle uniform devices or also diverse ones. The RF model generated 40 features of varying importance. As a result, only general information on failures was available. Detailed metrics were provided only for the training sets. The accuracy and sensitivity obtained for the training data are high: 1 and 0.91, respectively. For the test data, the same metrics are 0.89 and 0.46. Article [] contains a review of related works.
In article [], time series forecasting is proposed for the maintenance of refrigeration systems. The case of cooling systems in supermarkets is particularly considered: refrigerated display cases and cold rooms. Unlike most other approaches, the focus is on detecting normal behavior. Measurements obtained from sensors form time series that are used to model the system’s behavior under normal conditions. The detection of abnormal behavior is reported. The methodology is based on signal processing and statistical tools. The authors state that the results of preliminary experiments are promising, although no quantitative data confirm this.
This article presents the design and empirical validation of a fault detection and prediction system for refrigeration equipment built using RST and IoT technology.
The system to be designed is expected to have the following features:
  • low cost of implementation and exploitation;
  • high interpretability to humans, which is important in technical diagnostics;
  • ability to operate on unlabeled data, which is beneficial in systems where complete data is hard to obtain;
  • rapid failure prediction with type-based classification;
  • automatic identification of the minimal set of features (signals) required for diagnosis, resulting in attribute reduction and simplifying the model;
  • generation of ‘if–then’ decision rules, easy to interpret and implement in expert systems;
  • handling uncertainty;
  • good performance.
Taking into account new tools for rough sets methodology developed in [,], and prior experience in using this method in diagnostics of electrical devices, the RST method was selected for our research project.
To the best of our knowledge, RST has never been used in RCSS diagnostics nor compared with other, more commonly used methods. Compared to the RF method, RST is much simpler and more precise, focusing on a specific type of device and different types of faults being diagnosed.
The starting point for our research was the articles [,]. In [], an initial concept of a rough inference software system for computer-assisted reasoning was presented, with application to the diagnostics of a power transformer. Next, in [], a more complex concept of a diagnostic system for cooling appliances, based on automata theory [,], time series analysis [,,], and rough set theory [,], was developed. In the present paper, this concept was verified under real-world conditions through the construction of a monitored experimental setup and a diagnostic system. Based on statistical data covering the most significant faults, a series of experiments simulating specific failures was conducted. The designed system, using modern technologies, collects and analyzes measurement data from the device and makes diagnostic decisions based on these data.
A Moore automaton model was used to describe the behavior of the refrigeration device, in which the state transition function depends on the current state and input, while the output function depends solely on the state of the automaton. Sequences of measurement data representing physical values from the device are treated as time series, whose properties can be analyzed and trends identified in order to determine inputs that trigger transitions between states of the automaton [].
The rough set theory presented in [,] has multiple applications, including medical and technical diagnostics as well as predictive fault detection. It may serve to build decision-making systems that can generate decision rules based on incomplete or imprecise data, defining the relationships between the states of the system and the observed values of input attributes. Because the combinatorial problems that arise during the construction of such systems belong to the NP-complete class (in particular cases also encountered in the logical synthesis of digital circuits), the use of approximate algorithms is justified.
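The way rough sets cope with imprecise data can be illustrated with the lower and upper approximations of a set. The following is a minimal sketch, not the authors' software: the toy table, the object identifiers, and the function name `approximations` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    # Objects with identical values on `attributes` are indiscernible
    # and fall into the same equivalence class.
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target set
            lower |= members
        if members & target:    # class overlaps the target set
            upper |= members
    return lower, upper


# Hypothetical toy data: four observations described by two discretized attributes.
table = {
    "x1": {"A1": 0, "A2": 0},
    "x2": {"A1": 0, "A2": 0},
    "x3": {"A1": 1, "A2": 0},
    "x4": {"A1": 1, "A2": 1},
}
faulty = {"x2", "x3"}  # observations labelled as faulty (assumed)
lower, upper = approximations(table, ["A1", "A2"], faulty)
```

Here `x1` and `x2` cannot be told apart, so only `x3` is certainly faulty (lower approximation), while `x1`, `x2`, and `x3` are possibly faulty (upper approximation).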
Diagnostic experiments at the research stand were conducted by simulating various types of failures, and the results were statistically analyzed. Based on these results, evaluation metrics such as accuracy, precision, sensitivity, the positive likelihood ratio (LR+), and the F1 score were determined, which allow for the assessment of the diagnostic system’s quality [,].
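The metrics listed above follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not results from this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute common diagnostic quality metrics from confusion-matrix counts.

    Assumes non-degenerate counts (no zero denominators).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)           # also called recall
    false_positive_rate = fp / (fp + tn)   # equals 1 - specificity
    lr_plus = sensitivity / false_positive_rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "lr_plus": lr_plus,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
```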
The obtained numerical data are analyzed in detail. The conclusions from the research support the thesis that the proposed solution is correct and effective.
In summary, the study presented in the article addresses a research gap in the development of systems for detecting and predicting failures in refrigeration equipment. There is a need to create a simple, low-cost, user-friendly distributed system with edge-deployable diagnostics for commercial refrigeration using a minimal number of sensors and computational resources. The aim of the research is to design a system with the desired characteristics using available AI and IoT technologies.
This article contributes to fulfilling these requirements by:
  • identifying a set of predictable refrigerator malfunctions based on manufacturer data;
  • designing a dedicated test setup;
  • building a refrigerator behavior model;
  • selecting an artificial intelligence method;
  • experimental selection of measurement methodology, signal representation, and processing;
  • optimization of the resulting discrete model and generation of detection rules;
  • statistical verification of the designed system’s quality;
  • designing IoT system architecture for monitoring distributed refrigeration units along with implementation details;
  • the possibility of extending the solution to enable monitoring of other types of supported devices using the same methodology, without changing the overall system architecture.
The article is organized as follows: Section 2 presents a description of typical faults occurring in refrigeration systems and statistical data on their occurrence. Section 3 describes the measurement system used for data collection. Section 4 provides a description of the fault detection model based on rough set theory. Section 5 discusses the obtained results, including an evaluation of the system’s effectiveness under real-world conditions. In Section 6 the design methodology for the RS failure prediction system is formulated. The article concludes with a summary highlighting the main research findings and potential directions for further development of the solution presented.
In addition to describing qualitative and quantitative research results, the authors have made the measurement data available in a public repository.

2. Refrigeration Equipment Failures

To identify the most common causes of failures, an analysis of the service documentation provided by a domestic manufacturer of refrigeration equipment, JUKA, Niepołomice, Poland [], was conducted, covering the period from 2011 to 2019. Failures that did not directly affect the refrigeration function of the device, such as lighting failures or mechanical damage caused by improper use, were excluded from the analysis. The results of the fault analysis are presented in Table 2.
Table 2. Analysis of the frequency of failures for specific types of refrigeration equipment based on service protocols collected between 2011 and 2019 by the domestic manufacturer JUKA [].
The analysis revealed that the main causes of failures are fan failures, compressor failures, controller and associated temperature sensor failures, and evaporator heater failures. These categories collectively account for 98.6% of all refrigeration system failures and 82.8% of total commercial refrigeration equipment failures [] and are highlighted in bold font. These failures occur independently, and the diagnostic system is tasked with independently predicting each as early as possible. It should be noted that due to their nature, it is not possible to predict failures of the controller, temperature sensors, or evaporator heaters (condensate leakage). These failures can only be detected when they occur. Consequently, the focus was placed on predicting failures of fans (evaporator/unit chamber) and compressors, which manifest through changes in the operational parameters of the refrigeration device. The proportion of these two classes of predictable failures is given in the last column of Table 2.
For each of the selected failures, based on discussions with industry experts and preliminary tests, their causes, consequences, impact on changes in operating parameters, and methods of forecasting/detection based on changes in operating parameters were determined. The monitoring points within the refrigeration device were then identified.
Initially, data were collected for the following parameters: temperature and humidity in the unit chamber, ambient temperature and humidity, energy consumption, refrigerator temperature, and temperatures measured by three probes at designated locations within the device.
After conducting preliminary experiments and analyzing the time series, we concluded that the environmental parameters are not significant and that changes in energy consumption cannot be linked to specific failures of the cooling unit.
As a result of the ablation analysis, our initial diagnostic model was reduced to just five temperature parameters inside the refrigerator. As we will show in Section 4, in the final diagnostic model, after removing unnecessary complexity, the temperature measurements in the storage compartment and the unit chamber alone were sufficient to generate decision rules.
Minimizing the number of measurement points provides two main benefits: reduced system implementation costs and decreased data volume that must be transmitted to the cloud platform and analyzed by the analytical system. Furthermore, relying solely on temperature measurements makes the solution more independent of the refrigeration device model, as no control signals from the device’s control system are used for fault detection/prediction.
In order to collect measurement data and conduct experiments, a dedicated stand was constructed based on a Juka Vienna refrigeration unit []. This unit, shown in Figure 1, consists of a storage compartment maintained at a set temperature for storing products, and a unit chamber containing the essential components for refrigeration operation (compressor, controller, fans, and power module). During the experiments, measurement probes were placed in the device as required (sometimes based on the manufacturer’s prior experience), ensuring that both storage compartment temperatures and key points within the unit chamber equipment were recorded (Figure 2).
Figure 1. Refrigerated cabinet and monitored parameters. The unit chamber is marked with red, while the storage compartment is marked with blue [].
Figure 2. Interior of the unit chamber and the placement of temperature probes 1, 2, and 3.
For continuous monitoring of specific points, wireless temperature loggers based on digital sensors were used, offering high accuracy from −40 °C to +85 °C with a measurement error not exceeding 0.5 °C. The measurements covered both the period of fault-free operation and a certain time period after the occurrence of a specific failure. Loggers recorded data every minute for 40 min: 10 min before the fault and 30 min after its occurrence. As explained in Section 4, the sampling rate and time windows were determined experimentally in order to ensure correct detection of events in the time series and optimal periodic transmission of measurement data.
Open access to the developed software in Python 3.12, measurement data, spreadsheets, etc., is provided via GitHub, as declared in the Data Availability Statement.

3. Measurement System Architecture and Applied Technologies

The measurement system was designed using a distributed architecture, where individual data loggers communicate with a gateway via the Bluetooth Low Energy (BLE) protocol. This technology was selected due to its low energy consumption and sufficient range for monitoring individual refrigeration devices. BLE also ensures stable communication in industrial environments, where electromagnetic interference may occur [].
The system gateway is equipped with an NB-IoT (Narrowband Internet of Things) communication module, enabling transmission of collected data to a cloud computing platform. NB-IoT was chosen for its energy efficiency, extended coverage, and ability to operate under challenging conditions where conventional cellular networks may be limited []. The gateway also features local buffer memory to safeguard data in the event of temporary disconnection from the server. Each data logger is assigned a unique serial number, allowing unambiguous identification of the data source and facilitating system configuration.
Since the data loggers are battery-powered, special attention was paid to optimizing energy consumption by the measurement modules. An adaptive algorithm for data transmission frequency was implemented, adjusting the transmission rate according to the dynamics of the measured parameters. During periods of stable operation in the refrigeration system, the transmission frequency is reduced, significantly extending the battery life of the devices. Wireless HS6 temperature loggers from Efento, Kraków, Poland [] with data transmission via Bluetooth were used for temperature monitoring. They can operate with a minimum interval of 1 min between consecutive measurements. With this type of usage, the battery life is 4–5 years.
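The idea of adapting the transmission rate to signal dynamics can be sketched as follows. This is only an illustration under stated assumptions: the function name, the 15 min upper bound, and the 0.5 °C stability band are hypothetical and are not taken from the logger firmware.

```python
def next_transmission_interval(recent_temps,
                               base_interval_min=1,
                               max_interval_min=15,
                               stable_band_c=0.5):
    """Pick the next transmission interval (minutes) from recent readings.

    All thresholds are illustrative assumptions, not vendor settings.
    """
    if len(recent_temps) < 2:
        # Not enough history to judge stability: report at the fastest rate.
        return base_interval_min
    spread = max(recent_temps) - min(recent_temps)
    if spread <= stable_band_c:
        # Stable operation: transmit rarely to save battery.
        return max_interval_min
    # Dynamic behaviour: transmit at the fastest rate.
    return base_interval_min
```

For example, readings of 4.0, 4.1, and 4.2 °C would be treated as stable, whereas a jump from 4.0 to 6.0 °C would switch the logger back to the shortest interval.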
To simplify the design of the data loggers and minimize implementation costs, no processing of measurement data occurs on the loggers or gateway. Data is transmitted to the cloud computing platform for processing. The only operations performed locally by the loggers are the validation of measurement values. Algorithms for measurement correctness verification and outlier detection were implemented, enabling elimination of erroneous readings at the data acquisition stage. Each measurement is time-stamped and includes information about the logger’s status, allowing verification of data reliability. Additionally, the measurement system includes self-diagnostic mechanisms that monitor battery status, wireless connection quality, and proper functioning of the loggers.
Measurement data from the loggers are transmitted to the cloud platform for further processing and analysis by artificial intelligence algorithms. A dedicated API was developed to provide two-way communication between the measurement system and the analytical module. In addition to receiving measurement data, the API allows remote adjustment of logger settings. The system also enables the configuration of alarm thresholds for individual signals and definition of user notification rules, which can be modified remotely depending on the application requirements. The architecture of the implemented measurement system with IoT data flow to decision outputs is illustrated in Figure 3.
Figure 3. Diagram of the measurement system architecture integrated with the analytical system.

4. Analytical System for Failure Prediction and Detection

The analytical system for failure prediction and detection was developed based on automata theory and rough set theory, which enable the simplification of diagnostic models by reducing the number of states and attributes in the decision tables [] required for accurate diagnostics. This optimization improves decision-making speed, reduces computational costs, and ensures unambiguous diagnoses.
Systems based on rough set theory can be applied to both the diagnosis of simple systems (e.g., transformer diagnostics based on gas ratios analysis []) and more complex discrete systems, which can be modeled as deterministic finite automata (state machines). In such automata, normal operating states are defined together with states representing system failures, which require service intervention or replacement of faulty components. Transitions between states are modeled using configurations of selected attribute values, which serve as the automaton inputs.
In the model of the refrigeration device, the automaton describes state behavior, which basically relies on transitions between two normal operating states. Only after a component failure is detected does a transition to the designated emergency state occur, triggering appropriate failure-handling procedures. The device component must be replaced or repaired by service personnel before the consequences of the failure damage the refrigeration unit or the products stored within it, thereby preventing significant losses.
Automated diagnosis of systems modeled by state machines is based on rough sets, and information attributes may represent directly measurable physical quantities (e.g., temperature and humidity) obtained through periodic sampling at discrete time intervals.
Attributes may also be defined indirectly, based on processed raw measurement data. The observed system changes (e.g., exceeding a predefined attribute threshold or a change in trend) concern sequences of attribute values and calculations performed within so-called time windows [,].
In the diagnostics of refrigeration equipment, we are dealing with the system described above, characterized by states of normal operation and occasional failures of selected system components (the compressor, the condensing unit fan, or the evaporator fan). Although, from the perspective of automatic control, the time characteristics of the device are relatively slow-varying, fault detection should be as rapid as possible in order to maximize the available reaction time for service intervention.
The operation of the refrigeration device was modeled using a deterministic finite Moore automaton consisting of finite sets of inputs (I), internal states (S), and outputs (O). The state behavior of the automaton is defined by the transition function, while the outputs are computed by the output function [].
A simplified Moore machine model of a refrigeration device, comprising six distinct states, is illustrated in Figure 4. The initial state S0 represents the state of the device from power-on until the transition into the primary (normal) operational state S1. The second normal operating state S2 corresponds to automatic defrost. The remaining three states (compressor failure S3, condensing unit fan failure S4, and evaporator fan failure S5) are reached from the state S1 under inputs i3, i4, and i5, respectively, and require a service intervention. Each operational state S1–S5 is uniquely associated with the corresponding output O1–O5.
Figure 4. Simplified Moore automaton model of the refrigeration system.
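A Moore automaton of this kind can be sketched as two lookup tables. The transitions from S1 to S3, S4, and S5 follow the description above; the inputs that drive S0 to S1 and the switching between S1 and S2, the missing output for S0, and the treatment of failure states as absorbing are assumptions made for this illustration.

```python
# Transition function: (state, input) -> next state.
TRANSITIONS = {
    ("S0", "i1"): "S1",  # assumed: i1 moves the device into normal operation
    ("S1", "i2"): "S2",  # assumed: i2 starts automatic defrost
    ("S2", "i1"): "S1",  # assumed: i1 ends defrost
    ("S1", "i3"): "S3",  # compressor failure
    ("S1", "i4"): "S4",  # condensing unit fan failure
    ("S1", "i5"): "S5",  # evaporator fan failure
}

# Output function: depends on the state only (Moore machine).
OUTPUTS = {"S0": None, "S1": "O1", "S2": "O2", "S3": "O3", "S4": "O4", "S5": "O5"}


def step(state, symbol):
    """Return the next state; undefined pairs leave the state unchanged (assumption)."""
    return TRANSITIONS.get((state, symbol), state)


def run(inputs, state="S0"):
    """Feed a sequence of inputs and return the visited (state, output) pairs."""
    trace = [(state, OUTPUTS[state])]
    for symbol in inputs:
        state = step(state, symbol)
        trace.append((state, OUTPUTS[state]))
    return trace


trace = run(["i1", "i2", "i1", "i4", "i1"])
```

In this sketch the failure state S4 persists after it is reached, which mirrors the requirement that a service intervention is needed before normal operation resumes.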
In the analytical system, signal time series are discretized into the inputs i1–i5 of the automaton. The mapping onto these inputs is based on a vector of processed measurement data that is obtained through smoothing and differencing of sampled input signals, recorded by temperature loggers located in the storage compartment and inside the condensing unit.
Analysis of the first batch of measurement data (30 min), provided in CSV format and covering normal operation and unit chamber failure, showed that the measurements for normal operation fell within narrow value ranges. However, regardless of the type of failure, measurements that started at the moment of failure, without reference data preceding it, made unambiguous detection of trend changes difficult or impossible. Consequently, subsequent batches of data consisted of a 10 min period before the failure and 30 min from the moment the failure occurred (a total of 40 min).
The justification for using a 10 min + 30 min measurement window comes from the assessment of time constants in the system and the smoothing window. Five separate objects are considered, each for a pair of input and output signals in the time domain.
We used an approximation of higher-order objects as first-order objects with delay, where T is the equivalent time constant and d is the delay. However, after powering on, the actual power is delivered to the cooling unit by the control unit gradually in the following steps: 120 W, 300 W, 540 W, 600 W, and 660 W—over 5 min. Therefore, the evaluation of Ti and di for each specific ith object cannot be easily calculated based on the time response to a step signal. With the modified approximation method and a maximum chamber load of 18 kg, the parameters of all objects were approximated as follows:
Unit chamber: T1 = 11 min, d1 = 1 min;
Refrigeration chamber: T2 = 23 min, d2 = 1 min;
Sensor 1: T3 = 5 min, d3 = 0;
Sensor 2: T4 = 2 min, d4 = 0; and
Sensor 3: T5 = 11 min, d5 = 3 min.
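For orientation, the meaning of T and d can be illustrated with the textbook unit-step response of a first-order object with delay. This sketch is not the modified approximation method used above, which had to account for the staged power delivery; the function name and the unit gain are assumptions.

```python
import math


def fopdt_step_response(t, T, d, gain=1.0):
    """Unit-step response of a first-order object with time constant T and delay d.

    Textbook model only; time values are in minutes.
    """
    if t <= d:
        # The output does not react before the delay has elapsed.
        return 0.0
    return gain * (1.0 - math.exp(-(t - d) / T))


# Refrigeration chamber parameters from the list above: T2 = 23 min, d2 = 1 min.
fraction_after_one_time_constant = fopdt_step_response(24, T=23, d=1)
```

One time constant after the delay has elapsed, the response reaches 1 − e⁻¹ (about 63%) of its final value, which is why the longest time constant (23 min) bounds the required observation period after a failure.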
As described later in this section, the smoothing window spans 7 time units (each unit = 1 min). Therefore, adding a certain margin, we assumed that the time period before a failure occurs = 10 min > 7 min, and the time period after the failure occurs = 30 min > 23 min.
Analysis of the obtained data for different types of failures led to the conclusion that it was still difficult to detect the moment of trend change associated with the failure due to continuous changes in the sign and value of increments of individual measurement data in successive time intervals. In order to overcome this difficulty, well-known methods of processing/transforming time series data were applied to the measurement data stream, namely smoothing and differencing. These methods have their own variants and parameters; they can also be combined but there are no rules as to which combinations are optimal.
Calculating moving averages online is possible when $2q + 1$ elements of the time series are available at time $t$. Therefore, the arithmetic moving average at time $t$ is described by the formula $M_t = \frac{1}{2q+1} \sum_{j=0}^{2q} x_{t-j}$, where $q = 1, 2, 3, \ldots$ is the order of this moving average. The second transformation of the input data at time $t$, that is, differencing, is described by the formula $R_t = x_t - x_{t-1}$. Differencing operations can be composed, meaning that a previously differenced sequence can be differenced again. Similarly to smoothing, differencing can be performed iteratively starting from the value $t = 2q + 1 + k_r$ with a step $k_r \geq k_w$, provided that sufficient detection effectiveness is ensured.
Experimentally, the following composition of input data transformations was determined for use in the further part of the project:
smoothing of the time series using the arithmetic mean $M_t$ of order $q = 3$, performed every $k_w = 2q + 1$ time units: $t = i k_w$, $i = 1, 2, 3, \ldots$;
differencing of these arithmetic means, $R_t = M_t - M_{t-k_r}$, performed every $k_r = k_w$ time units: $t = i k_w$, $i = 1, 2, 3, \ldots$
In the program, smoothing is applied to each individual sample, with a time window of 7.
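The two transformations can be sketched in a few lines. This is an illustrative reading of the formulas above, not the published tool: the function names are hypothetical, `moving_average` corresponds to per-sample smoothing with a window of 2q + 1, and `block_means` with `block_differences` corresponds to the composition with k_w = k_r = 2q + 1.

```python
def moving_average(x, q):
    """Trailing arithmetic mean of order q, computed at every sample.

    Entry k is the mean of the 2q + 1 samples x[k] .. x[k + 2q].
    """
    w = 2 * q + 1
    return [sum(x[k:k + w]) / w for k in range(len(x) - w + 1)]


def block_means(x, q):
    """Means M_t taken every k_w = 2q + 1 samples (non-overlapping windows)."""
    w = 2 * q + 1
    return [sum(x[k:k + w]) / w for k in range(0, len(x) - w + 1, w)]


def block_differences(means):
    """Differences R_t = M_t - M_{t - k_r} between consecutive block means."""
    return [later - earlier for earlier, later in zip(means, means[1:])]


# Synthetic series for illustration: 14 one-minute samples rising by 1 per minute.
samples = list(range(1, 15))
means = block_means(samples, q=3)
diffs = block_differences(means)
```

With q = 3 and a 40-sample recording, this block-wise variant yields five means and four differences, so each recorded experiment contributes only a handful of trend values per attribute.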
The resulting time series for individual attributes allow for the estimation of trend changes corresponding to transitions between operating states in the refrigeration device’s automaton model.
Let us describe the discretization method in detail. Failure events occurring during the normal operation {S1, S2} manifest as changes in the trends of one or more inputs of the automaton model. These trend changes must be sufficiently pronounced to exceed predefined threshold values. Thresholds are determined separately for each state transition, with those that maximize correct fault detections selected as candidate boundaries of the relevant interval. The final threshold for a given attribute is defined as the minimum absolute difference in the time series that indicates failure, while the second boundary is determined as the maximum absolute difference observed across all examined series for that specific failure. A necessary condition for unambiguous fault detection is that intervals associated with the same attribute across different states remain mutually exclusive. The above operations can be expressed by pseudocode:
1. foreach attribute time series do:
  1.1. Determine disjoint ranges of attribute values that are specific to particular
    states of Moore automaton (overlapping ranges are useless).
  1.2. Assign to those states corresponding combinations of attribute ranges.
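A minimal Python sketch of the two pseudocode steps is given below. The interval boundaries are hypothetical placeholders, not the thresholds determined on the experimental stand, and the helper names are assumptions introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # closed interval [low, high]


def are_disjoint(intervals: List[Interval]) -> bool:
    """Check that no two closed intervals share a point."""
    ordered = sorted(intervals)
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


def discretize(value: float, ranges: Dict[int, Interval]) -> Optional[int]:
    """Map a differenced attribute value to the code of the range containing it.

    Returns None when the value falls outside every range, i.e. when it
    carries no information about a state transition.
    """
    if not are_disjoint(list(ranges.values())):
        raise ValueError("overlapping ranges cannot identify a state unambiguously")
    for code, (low, high) in ranges.items():
        if low <= value <= high:
            return code
    return None


if __name__ == "__main__":
    # Hypothetical ranges for one attribute (placeholder values only).
    example_ranges = {0: (-0.5, 0.5), 1: (0.8, 2.0), 2: (-3.0, -1.0)}
    print(discretize(1.2, example_ranges))
```

Rejecting overlapping ranges up front mirrors the requirement that intervals of the same attribute remain mutually exclusive across states.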
On our experimental stand, the interval thresholds for the five inputs i1–i5 of the Moore automaton were determined using a dedicated Python-based tool developed for automated analysis of measurement data representing various operational states of the refrigeration system. The results obtained enabled the construction of the corresponding decision table according to the rough set methodology. The final rough set model consists of five states S1–S5, five attributes A1–A5, and five diagnostic decisions D1–D5 (see Table 3).
Table 3. Decision table of the rough set model with discretized input variables (attributes).
Application of the approximate algorithm MV-DRMAX [] reduced the attribute set to two attributes, {A1, A2}. Then, using the approximate algorithm DT-RULE-GEN [], a set of simple inference rules was generated for the decision table, as shown in Table 4.
Table 4. Decision rules for the prediction and fault detection model with attribute set {A1,A2}.
Here, Ai(j) denotes state j of the attribute Ai. Decisions D(1) and D(2) are associated with standard operating modes (normal operation and defrosting) and serve monitoring purposes. In contrast, decisions D(3), D(4), and D(5) correspond to faults requiring immediate service intervention to restore proper operation of the respective component. The designated decision rules are straightforward to implement in any programming language.
Another reduced set of attributes is {A2, A4}, for which alternative rules for decisions D1–D4 should be generated.
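As an illustration of how such rules could be implemented, the following Python sketch evaluates a rule list over discretized attribute values. The rules in `RULES` are hypothetical placeholders and are not the rules of Table 4; only the split between monitoring decisions D(1), D(2) and fault decisions D(3)–D(5) follows the text.

```python
from typing import Dict, List, Optional, Tuple

# A rule pairs required attribute states, e.g. {"A1": 1, "A2": 2}, with a decision number.
Rule = Tuple[Dict[str, int], int]

# Placeholder rules for illustration only; they do NOT reproduce Table 4.
RULES: List[Rule] = [
    ({"A1": 1, "A2": 1}, 1),
    ({"A1": 1, "A2": 2}, 2),
    ({"A1": 2}, 3),
]

FAULT_DECISIONS = {3, 4, 5}  # D(3), D(4), D(5) require service intervention


def decide(sample: Dict[str, int], rules: List[Rule]) -> Optional[int]:
    """Return the decision of the first rule whose conditions all match."""
    for conditions, decision in rules:
        if all(sample.get(attr) == state for attr, state in conditions.items()):
            return decision
    return None  # no rule fired


def requires_service(decision: Optional[int]) -> bool:
    """True for fault decisions, False for monitoring decisions or no match."""
    return decision in FAULT_DECISIONS


if __name__ == "__main__":
    outcome = decide({"A1": 2, "A2": 1}, RULES)
    print(outcome, requires_service(outcome))
```

Keeping the rules as data rather than nested conditionals makes it easy to swap in an alternative rule set, such as one generated for the reduct {A2, A4}.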

5. Analytical System Test Results

The developed fault detection system was evaluated using historical datasets that had not been utilized during model development, thereby enabling independent verification of its effectiveness. In addition, system validation was conducted on real-time data acquired from a physical refrigeration unit equipped with dedicated measurement loggers. To emulate realistic failure scenarios, different types of faults were deliberately introduced, for example, by employing worn-out components or artificially restricting airflow through fans. In total, 4659 data samples were used for testing the analytical system.
The results of the general detector are presented in Table 5. For each of the three fault types and normal operation, the table reports the number of correct classifications (TP—true positives, TN—true negatives) and misclassifications (FP—false positives, FN—false negatives). The total number of cases is given as Sum = TP + FP + FN + TN. Based on these data, the following performance metrics of the diagnostic system were determined: Accuracy (ACC = (TP + TN)/Sum), Precision (Positive Predictive Value, PPV = TP/(TP + FP)), Recall/Sensitivity (True Positive Rate, TPR = TP/(TP + FN)), Specificity/Selectivity (True Negative Rate, TNR = TN/(TN + FP)), False Positive Error Rate (FPR = 1 − TNR), Positive Likelihood Ratio (LR+ = TPR/FPR), and the Harmonic Mean (F1-score = 2(PPV·TPR)/(PPV + TPR)), which provides a synthetic measure of diagnostic quality [].
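The metric definitions above can be written as a short Python function. This is a sketch using hypothetical confusion-matrix counts, not the values of Table 5; the function name and the choice to return `None` for undefined ratios are assumptions of this example.

```python
from typing import Dict, Optional


def safe_div(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Divide, returning None when the ratio is undefined."""
    if numerator is None or not denominator:
        return None
    return numerator / denominator


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Compute ACC, PPV, TPR, TNR, FPR, LR+ and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    ppv = safe_div(tp, tp + fp)
    tpr = safe_div(tp, tp + fn)
    tnr = safe_div(tn, tn + fp)
    fpr = safe_div(fp, fp + tn)  # algebraically equal to 1 - TNR
    f1 = None
    if ppv is not None and tpr is not None:
        f1 = safe_div(2 * ppv * tpr, ppv + tpr)
    return {
        "ACC": safe_div(tp + tn, total),
        "PPV": ppv,
        "TPR": tpr,
        "TNR": tnr,
        "FPR": fpr,
        "LR+": safe_div(tpr, fpr),  # undefined (None) when FPR is 0 or missing
        "F1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in diagnostic_metrics(tp=80, fp=20, fn=20, tn=880).items():
        print(name, value)
```

Note that LR+ is undefined when there are no false positives (FPR = 0), so the sketch reports it as `None` rather than dividing by zero.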
Table 5. Classification results of individual fault types by the analytical system based on the rough set method. Total number of samples: Sum= 4659.
Analysis of Table 5 shows that the system’s accuracy ACC lies within the range of approximately 0.76–0.90, reaching its highest values for normal operation states (N = {S1, S2}) and the condensing unit fan failure state S4. For many refrigeration devices, this level of accuracy is considered sufficient. Precision (PPV), that is, the fraction of positive diagnoses that are correct, is highest for states S3, S4, and N, and lowest for state S5. The low precision for S5 can likely be attributed to the misclassification of the defrosting state S2 as an evaporator fan failure. A potential solution to this issue, which would improve both precision and overall accuracy, as well as directions for future research, are discussed in the Conclusions (Section 7).
Sensitivity, reflecting the system’s ability to correctly identify a fault when it occurs, is high for states N, S4, and S5, but notably low for the compressor failure state S3, where TPR = 0.0954. Across all states, specificity remains relatively high, with the lowest value observed for S5 still exceeding TNR = 0.78 (FPR = 1 − TNR = 0.22).
The Positive Likelihood Ratio (LR+ = TPR/FPR) column indicates the likely shape of the ROC curves. Except for the extreme case of compressor failure (S3, PPV = 1.0), for which there are no false positives and LR+ is therefore undefined, the remaining states N = {S1, S2}, S4, and S5 have the respective LR+ values 4.3188, 15.532, and 3.712, all for FPR < 0.5.
The integrated F1 measure, defined as the harmonic mean of precision and recall, provides a consolidated assessment of diagnostic quality. Its values range from the maximum F1 = 1 for compressor failure S3, through high levels for S4 and normal operation N, to a relatively low value for S5, which can be explained by the low precision (PPV = 0.3328).
For comparison, in the RF method [], the metrics for the test data are ACC = 0.89 and TPR = 0.46, but distinguishing between types of failures is not possible. In RST, the TPR values for all failure states except S3 are significantly higher, ranging from 0.78 to 0.98.
Overall, the test results confirm that it is feasible to construct an operational refrigeration fault detection system using the rough set method. At the same time, the evaluation also revealed specific limitations associated with the application of this approach.

6. Design Methodology for the RS Failure Prediction System

The presented design methodology and the computational results justify a formal generalization in the form of Algorithm 1, which is an improved version of that published in [].
The starting point is a behavioral description of the refrigerating system as the control object and the determination of its input and output signals. Then, the system is modeled by a discrete automaton. Continuous output signals are also discretized. The next step is building an appropriate rough set model. Finally, the decision rules that distinguish normal from failure states of the system are generated.
The attribute reduction and decision rule generation procedures are based on the original MV-DRMAX and DT-RULE-GEN algorithms, respectively [,]. It should be noted that these algorithms may yield multiple alternative solutions. In such cases, each solution should be evaluated, and the one providing the highest quality of diagnostics should be selected for practical application.
Algorithm 1 RS-FAILURE-PREDICTION-SYSTEM
1: Data: Behavioral description of the system (refrigerator) as a control object, reference signal (setpoint), measured signal (feedback), and other output signals for monitoring state of the object (all signals, except setpoint, are temperatures in the form of time series).
2: Result: Minimal decision rules for automated prediction of possible system failures
3: Identify the control object, i.e., determine its transfer function. The time constant must be large enough for efficient fault detection and handling.
4: Build the discrete model of the control system by means of automata.
  4.1. Determine the number of system states.
  4.2. foreach continuous output signal do
    make the analysis using time series techniques (smoothing, differencing) in order to determine:
    4.2.1. transition from initial state to the main operational state (normal work).
    4.2.2. signal trends and boundaries of value ranges that are essential for
     detection of failures (they are inputs for transition functions to the
     specified failure states).
    4.2.3. if extra operational states are also present then they have to be included
     in the model together with their transition functions.
5: On the basis of a discrete model, build appropriate information table (and decision table) for rough set model of the reasoning.
6: Check the discernibility condition.
7: Apply MV-DRMAX algorithm for possible reduction of the model.
8: Apply DT-RULE-GEN algorithm in order to generate minimal decision rules for automated prediction of possible system failures.
9: if multiple solutions are available then select one of them that satisfies secondary criteria, like performance metrics of the diagnostic system.
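Steps 6, 7, and 9 can be illustrated with a small Python sketch. It checks the discernibility (consistency) condition of a decision table and then searches exhaustively for the smallest attribute subsets that preserve it. The exhaustive search is a simple stand-in and is not the MV-DRMAX algorithm; the toy table is hypothetical and unrelated to Table 3.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Tuple[Dict[str, int], int]  # (condition attribute values, decision)


def is_consistent(table: List[Row], attributes: Sequence[str]) -> bool:
    """Discernibility condition: equal condition values imply equal decisions."""
    seen: Dict[Tuple[int, ...], int] = {}
    for conditions, decision in table:
        key = tuple(conditions[a] for a in attributes)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def smallest_reducts(table: List[Row], attributes: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return all minimum-size attribute subsets that keep the table consistent.

    Brute force over subsets; suitable only for small attribute sets.
    """
    if not is_consistent(table, attributes):
        raise ValueError("the full attribute set does not discern the decisions")
    for size in range(1, len(attributes) + 1):
        found = [s for s in combinations(attributes, size) if is_consistent(table, s)]
        if found:
            return found
    return []


if __name__ == "__main__":
    # Hypothetical toy decision table (not Table 3).
    toy_table: List[Row] = [
        ({"A1": 0, "A2": 0, "A3": 0}, 1),
        ({"A1": 0, "A2": 1, "A3": 0}, 2),
        ({"A1": 1, "A2": 0, "A3": 1}, 3),
        ({"A1": 1, "A2": 1, "A3": 1}, 4),
    ]
    print(smallest_reducts(toy_table, ["A1", "A2", "A3"]))
```

The toy table yields two alternative reducts, which is the situation addressed by step 9: each candidate would then be compared using secondary criteria such as the performance metrics of the diagnostic system.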

7. Conclusions and Directions for Future Research

As a result of the conducted research, a predictive–diagnostic system was developed that enables the identification and prediction of the most frequent failures in refrigeration equipment. A key feature of the proposed solution is the use of measurement data collected by low-cost, easily deployable wireless loggers, which significantly lowers the entry barrier for potential large-scale commercial implementations. Considering the high economic costs associated with refrigeration system failures, particularly in critical applications, the system should be prioritized for deployment in units used to store pharmaceutical products, including medicines and vaccines.
The methodology presented for the development of an analytical system for detecting both current and potential anomalies in equipment operation, based on IoT sensor data and rough set theory, is characterized by high versatility and may be adapted to other industrial applications. Effective implementation, however, requires both the appropriate selection of measurement loggers and the adaptation of the decision/diagnostic model to the specific operational characteristics of the monitored devices and the requirements of the target industry.
In conclusion, the initial decision to choose the rough sets methodology is fully justified by the simplicity and low implementation and maintenance costs. In particular, the quantitative measures can be summarized in several points:
The proposed methodology is designed to cater to various refrigeration devices according to the client’s needs.
Refrigeration devices can be modeled by RST in a nonredundant manner with minimal computational resources. Ablation analysis reduced the initial number of system inputs from 8 to 5. After the discretization step using RST, an additional reduction from 5 to 2 attributes was achieved, resulting in very simple decision rules.
The quantitative summary of the system performance is presented in Table 5. Experiments conducted on both real and simulated datasets demonstrated high failure detection effectiveness, with accuracy ACC ranging from 76% to 90% across the three most essential fault types. The Positive Likelihood Ratio grows from its minimum LR+ = 3.71 for failure state S5 to LR+ = 15.5 for state S4. For state S3, PPV = F1 = 1. The values of other statistical measures are provided for detailed comparisons. In the RF method [L], the metrics for the test data are ACC = 0.89 and TPR = 0.46, but distinguishing between types of failures is not possible. In RST, the sensitivity values for all failure states except S3 are significantly higher, ranging from 0.78 to 0.98.
Future work on the refrigeration fault detection system should encompass the following research areas:
Improvement of detection accuracy (ACC) for compressor and evaporator fan failures: the obtained results (Table 5) indicate that the analytical system performs least effectively in predicting and detecting these two types of faults. Misclassifications likely stem from erroneous labeling of the defrosting state as evaporator fan failure and of compressor failure as defrosting. A potential solution is a more precise fault classification based on discretizing the time series into automaton inputs using range extrema, so that the decision system with the newly assigned attributes models the device behavior better. The proposed improvement would make detection less ambiguous and reduce the number of false positives. However, since detection would rely on attribute values derived from extrema within the corresponding time interval, this approach would extend the fault detection time.
Validation under operational conditions: large-scale testing of the developed solution on a representative group of refrigeration devices operating in real-world conditions. Such studies would account for the impact of variable surroundings (e.g., room parameters), diverse storage temperature settings, and differing load levels within refrigeration chambers on detection performance.
Generalization of the solution: extending the applicability of the system to different classes and models of refrigeration units through the adaptation of machine learning algorithms and adjustment of system parameters to the specific characteristics of individual device types.
Cost–benefit analysis and potential migration of the decision model to IoT loggers: the low computational complexity of the developed decision model enables its implementation directly on IoT devices (edge computing). The primary benefit of such an approach is the reduction in communication frequency: data transmission would occur only upon detection of a potential failure (event-driven communication). This leads to optimized energy consumption and extended battery life of the monitoring devices. Additional benefits include increased system reliability and improved scalability through reduced server infrastructure load.
Comparison with other ML methods: the RST method, after the discretization stage, allows for the use of both exact and approximate algorithms for attribute reduction and the generation of decision rules, which are fully deterministic. For simple and medium-sized detection/prediction systems, it provides optimal or near-optimal results with low computational complexity. Randomization in the RF method can also lead to good results. This method seems to be a good candidate for comparisons if different types of devices and failures can be distinguished. However, a preliminary step such as ablation analysis, which helps reduce the problem size, must be conducted before applying the RF method; otherwise, the input model remains redundant.

Author Contributions

Conceptualization, Z.K. and P.S.; Methodology, Z.K. and P.S.; Software, B.K.; Validation, P.S. and B.K.; Investigation, P.S. and B.K.; Writing—Original Draft Preparation, P.S., B.K., and Z.K.; Writing—Review and Editing, Z.K.; Funding Acquisition, P.S. and Z.K. All authors have read and agreed to the published version of the manuscript.

Funding

The research and development carried out by Efento Ltd. within the project “Development of a remote system for identification and early prediction of failures (Predictive Maintenance) of refrigeration equipment, based on battery-powered Internet of Things devices and artificial intelligence algorithms” was partially funded by the European Union under the “Regional Operational Programme for the Małopolska Region for 2014–2020” under the “Knowledge Economy” programme. The research was also funded in part by a grant from Cracow University of Technology, Faculty of Electrical and Computer Engineering, 31-155 Kraków, ul. Warszawska 24.

Data Availability Statement

The code examples and data that support the findings of this study are openly available in the GitHub repository https://github.com/bartoszxkozlowski/rough-sets-refrigeration-detection (accessed on 12 September 2025). Further inquiries can be directed to the corresponding author.

Acknowledgments

Statistical data on the failure rates of refrigeration equipment components were provided by Juka []. The experimental testbed was constructed and made available by Efento Ltd. [].

Conflicts of Interest

Authors Piotr Szydłowski and Bartosz Kozłowski were employed by Efento Sp. z.o.o. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Gertler, J. Fault Detection and Diagnosis in Engineering Systems; CRC Press: Boca Raton, FL, USA, 1998.
  2. Bousdekis, A.; Apostolou, D.; Mentzas, G. Predictive Maintenance in the 4th Industrial Revolution: Benefits, Business Opportunities, and Managerial Implications. IEEE Eng. Manag. Rev. 2020, 48, 57–62.
  3. Melo, A.; Camara, M.M.; Pinto, J.C. Data-Driven Process Monitoring and Fault Diagnosis: A Comprehensive Survey. Processes 2024, 12, 251.
  4. Nunes, P.; Santos, J.; Rocha, E. Challenges in predictive maintenance—A review. CIRP J. Manuf. Sci. Technol. 2023, 40, 53–67.
  5. Jimenez, J.J.M.; Schwartz, S.; Vingerhoeds, R.; Grabot, B.; Salaün, M. Towards multi-model approaches to predictive maintenance: A systematic literature survey on diagnostics and prognostics. J. Manuf. Syst. 2020, 56, 539–557.
  6. Es-sakali, N.; Cherkaoui, M.; Mghazli, M.O.; Naimi, Z. Review of predictive maintenance algorithms applied to HVAC systems. Energy Rep. 2022, 8, 1003–1012.
  7. Ge, M.; Bangui, H.; Buhnova, B. Big Data for Internet of Things: A Survey. Future Gener. Comput. Syst. 2018, 87, 601–614.
  8. Nti, I.K.; Adekoya, A.F.; Weyori, B.A.; Nyarko-Boateng, O. Applications of artificial intelligence in engineering and manufacturing: A systematic review. J. Intell. Manuf. 2022, 33, 1581–1601.
  9. Fernandes, S.; Antunes, M.; Santiago, A.; Barraca, J.; Gomes, D.; Aguiar, R. Forecasting Appliances Failures: A Machine-Learning Approach to Predictive Maintenance. Information 2020, 11, 208.
  10. Fernandes, M.; Corchado, J.M.; Marreiros, G. Machine learning techniques applied to mechanical fault diagnosis and fault prognosis in the context of real industrial manufacturing use-cases: A systematic literature review. Appl. Intell. 2022, 52, 14246–14280.
  11. Carvalho, T.P.; Soares, F.A.; Vita, R.; Francisco, R.D.P.; Basto, J.P.; Alcalá, S.G. A systematic literature review of machine learning methods applied to predictive maintenance. Comput. Ind. Eng. 2019, 137, 106024.
  12. Gupta, S.; Kumar, A.; Maiti, J. A critical review on system architecture, techniques, trends and challenges in intelligent predictive maintenance. Saf. Sci. 2024, 177, 106590.
  13. Li, Z.; He, Q.; Li, J. A survey of deep learning-driven architecture for predictive maintenance. Eng. Appl. Artif. Intell. 2024, 133 Pt C, 108285.
  14. Zachariades, C.; Vigila, X. A Review of Artificial Intelligence Techniques in Fault Diagnosis of Electric Machines. Sensors 2025, 25, 5128.
  15. Adelekan, D.S.; Ohunakin, O.S.; Paul, B.S. Artificial intelligence models for refrigeration, air conditioning and heat pump systems. Energy Rep. 2022, 8, 8451–8466.
  16. Singh, V.; Mathur, J.; Bhatia, A. A comprehensive review: Fault detection, diagnostics, prognostics, and fault modeling in HVAC systems. Int. J. Refrig. 2022, 144, 283–295.
  17. Quispe-Astorga, A.; Coaquira-Castillo, R.J.; Mego, L.W.U.; Herrera-Levano, J.C.; Concha-Ramos, Y.; Sacoto-Cabrera, E.J.; Moreno-Cardenas, E. Data-Driven Fault Detection and Diagnosis in Cooling Units Using Sensor-Based Machine Learning Classification. Sensors 2025, 25, 3647.
  18. CoolAutomation Website (Petah Tikva, Israel). Available online: https://coolautomation.com (accessed on 12 September 2025).
  19. Demuth, H.B.; Beale, M.H.; De Jess, O.; Hagan, M.T. Neural Network Design, 2nd ed.; Martin Hagan: Stillwater, OK, USA, 2014; ISBN 978-0-9717321-1-7.
  20. Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356.
  21. Kulkarni, K.; Devi, U.; Sirighee, A.; Hazra, J.; Rao, P. Predictive Maintenance for Supermarket Refrigeration Systems Using Only Case Temperature Data. In Proceedings of the 2018 Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018; pp. 4640–4645.
  22. Facchinetti, T.; Arazzi, M.; Nocera, A. Time series forecasting for predictive maintenance of refrigeration systems. In Proceedings of the 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Falerna, Italy, 12–15 September 2022; pp. 1–6.
  23. Kokosiński, Z.; Jaworski, K. A rough inference software system for computer-assisted reasoning. In Advances in Artificial Intelligence-Based Technologies, Selected Papers in Honour of Professor Nikolaos G. Bourbakis; Virvou, M., Tsihrintzis, G.A., Tsoukalas, L.H., Jain, L.C., Eds.; Springer: Berlin/Heidelberg, Germany, 2022; Volume 1, pp. 59–76.
  24. Kokosiński, Z.; Szydłowski, P. Rough inference system for automated failure prediction in cooling devices. In Proceedings of the 2023 1st International Conference on Optimization Techniques for Learning (ICOTL), Bengaluru, India, 7–8 December 2023; IEEE: New York, NY, USA.
  25. Hartmanis, J.; Stearns, R.E. Algebraic Structure Theory of Sequential Machines; Prentice Hall Inc.: New York, NY, USA, 1966.
  26. Mikolajczak, B. (Ed.) Algebraic and Structural Automata Theory; North Holland: Amsterdam, The Netherlands, 1991; ISBN 9780080867847.
  27. Zagdański, A.; Suchwałko, A. Analysis and Prediction of Time Series; PWN: Warszawa, Poland, 2022; ISBN 9788301183561. (In Polish)
  28. Valenzuela, O.; Rojas, F.; Herrera, L.J.; Pomares, H.; Rojas, I. Theory and Applications of Time Series Analysis and Forecasting. In Selected Contributions from ITISE 2021; Springer: Cham, Switzerland, 2024; ISBN 978-3-031-14199-7.
  29. Barandas, M.; Folgado, D.; Fernandes, L.; Santos, S.; Abreu, M.; Bota, P.; Liu, H.; Schultz, T.; Gamboa, H. TSFEL: Time Series Feature Extraction Library. SoftwareX 2020, 11, 100456.
  30. Rutkowski, L. Computational Intelligence: Methods and Techniques; Springer: Berlin/Heidelberg, Germany, 2008; ISBN 978-3-540-76287-4.
  31. Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning and Data Mining; Springer Science+Business Media: New York, NY, USA, 2017; ISBN 9781489976857.
  32. Naidu, G.; Zuva, T.; Sibanda, E.M. A Review of Evaluation Metrics in Machine Learning Algorithms. In Artificial Intelligence Application in Networks and Systems. CSOC 2023; Lecture Notes in Networks and Systems; Silhavy, R., Silhavy, P., Eds.; Springer: Cham, Switzerland, 2023; Volume 724.
  33. Juka Website (Niepołomice, Poland). Available online: https://juka.com.pl/en/ (accessed on 12 September 2025).
  34. Juka Vienna Refrigerated Cabinet, Product Presentation, Technical Parameters and Datasheet. Available online: https://juka.com.pl/en/products/refrigerated-cabinets/229-vienna-o (accessed on 12 September 2025).
  35. Rondón, R.; Gidlund, M.; Landernäs, K. Evaluating Bluetooth Low Energy Suitability for Time-Critical Industrial IoT Applications. Int. J. Wirel. Inf. Netw. 2017, 24, 278–290.
  36. Bali, M.S.; Gupta, K.; Bali, K.K.; Singh, P. Towards energy efficient NB-IoT: A survey on evaluating its suitability for smart applications. Mater. Today Proc. 2022, 49 Pt 8, 3227–3234.
  37. Efento Website (Kraków, Poland). Available online: https://getefento.com (accessed on 12 September 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
