Article

Research on Operation Data Mining and Analysis of VRF Air-Conditioning Systems Based on ARM and MLR Methods to Enhance Building Sustainability

School of Civil Engineering, Zhengzhou University, Zhengzhou 450001, China
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(20), 8974; https://doi.org/10.3390/su17208974
Submission received: 27 August 2025 / Revised: 6 October 2025 / Accepted: 7 October 2025 / Published: 10 October 2025

Abstract

With the increasing intelligence of modern air-conditioning systems, the difficulty of acquiring data from air-conditioning systems has been significantly reduced. However, analyzing the massive amounts of data collected and obtaining more valuable information still remains challenging, especially considering the internal relationships behind the data. The purpose of this study was to conduct operational experiments on VRF systems under different indoor set temperatures, indoor set air speeds, and terminal load rates. Then, the patterns of various operating parameters and energy consumption of VRF systems during winter operation were analyzed based on unsupervised methods. Three machine learning methods were primarily employed in this study, including correlation analysis, data regression analysis, and association rule analysis. Finally, a regression model was constructed for energy consumption based on eight typical characteristic parameters. The experimental results showed that the system was stable to a certain degree at different wind speeds. Among the characteristic parameters, fixed frequency 1 exhaust temperature, compressor frequency, and other parameters have a significant positive effect on energy consumption, while fixed frequency 1 shell top oil temperature, inlet and outlet pipe temperature difference, and other parameters have a negative effect. The research results provide a reference for air conditioning system data mining and building sustainability.

1. Introduction

The increasing demand for smart buildings and energy management has resulted in a substantial proliferation of electrical equipment, leading to a notable escalation in energy consumption and carbon dioxide emissions. The increasingly severe environmental problems have posed enormous challenges to global sustainable energy development. According to relevant research data, the construction industry accounts for approximately 30% of total global energy consumption and has become one of the largest contributors to global energy consumption [1,2,3]. Specifically, the energy consumption of air conditioning systems accounts for 30% to 50% of the total annual energy usage of buildings [4,5,6]. The data highlights the enormous energy demand generated by air conditioning systems.
Compared with other HVAC systems, VRF can reduce greenhouse gas emissions such as carbon dioxide. With the integration of more renewable energy into air conditioning systems, researchers have demonstrated the energy-saving potential of VRF systems by comparing them with traditional HVAC systems through model simulations and field performance tests in various case studies [7]. The performance of traditional air conditioning and VRF systems was evaluated using EnergyPlus simulation by Aynur et al. [8]. The results showed that VRF systems can save 27% to 57% energy compared with traditional HVAC systems. The study also used EnergyPlus simulations to explore the energy-saving potential of VRF systems compared with traditional HVAC systems across various U.S. climates. Kim et al. [9] conducted relevant research and the results showed that VRF systems can save 15% to 42% energy with lower operating costs compared to standard rooftop units (RTUs), except in cold climates. Emrah et al. [10] demonstrated that implementing VRF systems can decrease energy consumption by up to 44% in commercial buildings compared to chillers with boilers. Therefore, VRF systems are more energy efficient and reduce carbon emissions more effectively than traditional HVAC systems.
However, the multisource heterogeneous data generated during the operation of VRF systems, such as energy consumption, environmental parameters, and equipment status, are high-dimensional, nonlinear, and dynamic. Traditional analysis methods struggle to uncover the hidden patterns in such data, which creates bottlenecks in system optimization, fault prediction, and energy efficiency management. Meanwhile, data mining techniques, which extract implicit patterns from massive datasets, have demonstrated powerful analytical capabilities in healthcare, finance, and other domains. Their application to VRF systems nevertheless remains largely underexplored, and systematic investigations are still lacking.
As a critical sector for achieving the dual-carbon goals, energy efficiency in building air conditioning has made remarkable progress in recent years through technological innovation, policy-driven initiatives, and market practices. In the field of energy-efficient building air conditioning, numerous technologies have been developed and innovated, such as intelligent control technologies, variable frequency technology, heating and cooling recovery technologies, as well as applications of various new energy sources in building air conditioning [11]. Liu et al. [12] designed and developed a PSF centrifugal compressor, achieving an adiabatic efficiency of 75.9–88.5% for the compressor. In terms of new energy applications, renewable energy sources such as geothermal and solar energy have been increasingly applied in HVAC design. For example, solar energy is used to construct heat pump systems, thereby reducing the dependence of HVAC systems on electricity.
Darwiche et al. [13] used geothermal energy as backup energy to operate a typical All-Air centralized HVAC system and found that introducing 100% fresh air intake ultimately saved 67% energy annually. Technologies such as fresh air precooling, solar photovoltaic power generation, and reclaimed water reuse can reduce reliance on traditional energy sources. Ra et al. [14] applied solar cell storage-integrated switchable glass topology to provide passive HVAC during the day in EV charging station control rooms. Wang et al. [15] developed a high-performance and compact counter-flow indirect dew point evaporative cooler, which eliminated the working air flow reversal in traditional M-cycle coolers, thereby reducing pressure drop and improving energy efficiency. The results confirmed that the cooler is suitable as an efficient precooling device. Shyu et al. [16] designed a system incorporating an anaerobic membrane bioreactor, which was completely powered by solar photovoltaics during the test period. Membrane and ion exchange treatment produced high-quality product water that meets local water recycling standards. All of the above provide effective solutions for reducing energy consumption.
With the application of automation systems and the Internet of Things (IoT) in buildings, sensors are used to collect vast amounts of data reflecting system operations, which hide abundant useful information and knowledge [17]. Furthermore, data mining (DM) technology has demonstrated its powerful capability to automatically analyze big data across various domains [18]. Data mining techniques are usually categorized as either supervised or unsupervised learning. Supervised learning is suitable for regression and classification modeling due to its strong mapping approximation capability. Unsupervised learning is suitable for discovering new information and knowledge by exploring correlations, associations, and patterns in data [19]. The proliferation of big data alongside the advancement of data mining technologies offers a viable pathway for the operational assessment of HVAC systems.
Unsupervised data mining has currently gained extensive application in building data analysis to extract more valuable information from actual operational data. Tian et al. [20] put forward an unsupervised data mining framework for the evaluation and optimization of HVAC system operation strategies, which realized an energy consumption reduction rate of 6.9%. Wang et al. [21] introduced a methodology for interpreting neural network models by leveraging model gradients. This method quantifies the marginal impact of input on output based on chain rules, reducing computation time by 40% without sacrificing model accuracy. Xu [22] put forward a data mining-driven approach for anomaly detection and dynamic energy performance assessment of HVAC systems. Qian et al. [23] conducted data mining on a large-scale dataset based on VRF big data to determine the actual behavior and energy consumption of residents in different climate zones. The data mining technology, including data preprocessing, cluster analysis, association rule mining, and post-processing, were fully utilized in the study to achieve dynamic multi-level energy consumption assessment of HVAC systems. However, more attention should be paid to direct variables that reflect the operational status such as temperature, flow rate, wind speed and pressure when using data mining methods to optimize the operation of the system. The heating and cooling load requirements of buildings can be predicted by utilizing big data analysis and machine learning algorithms. The intelligent control system can automatically adjust air conditioning parameters such as operation modes, temperature settings, and wind speed based on these predictions to achieve precise temperature control, avoiding over-cooling or over-heating. This can reduce energy consumption and improve system efficiency.
The data correlations among high-dimensional operational variables contain more hidden information, which is more helpful for evaluating system operational status. Yu et al. [24] explored a novel methodology to analyze all associations and correlations within building operational data, thus discovering valuable insights for energy conservation. Xiao et al. [25] put forward a practical framework for mining Building Automation System datasets with data mining techniques. The framework consists of five main steps. They are data preparation, cluster analysis, association rule mining, post-mining processing, and application of discovered knowledge. This framework serves as a critical link between BAS monitoring data and actionable HVAC operation strategies. Li et al. [26] devised a data mining-oriented approach to identify and interpret power consumption patterns and correlations, analyzing three time-independent influencing factors including part-load ratio, refrigerant charge level, and cooling conditions. The findings indicated that this approach can assist in identifying energy consumption patterns and extracting energy consumption rules within VRF systems. Here, energy consumption rules refer to stable, physically interpretable associations between operating parameters and energy consumption. For example, the increase in part-load ratio leads to a linear rise in compressor frequency and thus energy consumption under fixed refrigerant charge. Li et al. verified these rules by partitioning VRF operation data into different load intervals and using association rule mining to screen valid parameter-energy correlations. These studies employed advanced data mining methods to conduct comprehensive analyses of system operational data, thereby identifying several optimization directions. However, data mining strategy evaluation and optimization aim to improve efficiency. 
Therefore, a reasonable baseline should be extracted to determine whether set operational variables lead to inefficiencies. Wu [27] proposes recommendations for enhancing the energy efficiency standards of VRF equipment by collecting and pre-processing data, extracting operating parameters, conducting data analysis and comparisons, and discussing potential applications. Zhou [28] proposed a research methodology encompassing data preparation, feature selection, and VRF system energy consumption prediction to evaluate the generalization ability of the prediction model. This further confirmed the effectiveness and reliability of the methodology in energy consumption prediction for VRF air conditioning systems. However, the issue of correlation between variables has not been taken into account. To address the gaps in existing studies, this study designs two core sections to systematically implement the VRF system data mining, with clear inheritance and innovation from the aforementioned literature. In Section 2, three key variables were controlled, and high-precision sensors were used to collect data, ensuring the dataset covers typical winter heating scenarios for VRF systems in central China. A multi-layered analytical framework was built in Section 3. First, 11 parameters significantly correlated with energy consumption were selected from the initial set of 24 parameters through correlation analysis. Subsequently, MLR was employed to quantify linear relationships, and finally, ARM was utilized to uncover non-linear patterns under various operating conditions. This combination of methods ensures that the study not only inherits the advantages of existing research but also targets the characteristics of VRF winter heating systems to improve the comprehensiveness and practicality of results.
Additionally, the selection of feature variable sets exerts a substantial influence on data mining. Relevant analytical methods were applied in the study to eliminate redundant variables from the original feature set and prevent high correlations between the original variables. Data mining techniques were employed to optimize feature variable selection [29,30]. Data mining is the complex process of discovering hidden knowledge within large datasets [31,32,33]. Association rules are a technique used in data mining to identify potential relationships between variables or items within a dataset. They are often used to analyze transactional data. Data mining techniques and association rule mining [34,35,36,37] were widely used in this study to discover defects related to variables. Current studies mostly focus on single-parameter analysis of VRF systems such as energy consumption prediction or fault diagnosis, while few studies focus on systematic mining [38,39,40] of multivariate correlations and dynamic evolution patterns. Although machine learning [41,42] based methods perform well in classification and regression tasks, the continuity and spatial correlation of time series data have not been fully utilized. Traditional statistical methods have difficulty dealing with non-linear relationships.
To address the above gaps and given the complexity of VRF system operating scenarios and the difficulty of grasping actual operating conditions, this study selected data measured by a VRF system experimental platform and performed data mining analysis on the experimental data to derive the operating parameters and their impact on energy consumption. By adjusting the experimental settings, various operational parameters and energy consumption data during VRF system operation were measured, and data mining was performed on the collected data. The effects of variations in parameters such as system temperature, compressor frequency, EXV steps, average pressure and load on the operation of VRF system were analyzed, as were the effects of indoor set temperature, set air speed and load rate. The R software package, as well as the SPSS Statistics 27 and SPSS Modeler.v18 software packages, were used as data mining tools in this study. Correlation analysis was conducted via R software (R-4.5.1-win), multiple linear regression was applied for regression analysis, and the Apriori algorithm was utilized for association rule mining. This study combines multiple linear regression (MLR) with association rule mining (ARM). MLR is used to quantify the linear relationship between eight key feature parameters and energy consumption, while ARM is used to mine hidden association rules under different operating conditions based on the non-linear and dynamic characteristics of multi-source heterogeneous data in VRF systems, thereby deriving the relationship between various operating parameters and energy consumption. The essence of this combination is to address the inherent limitations of these two types of methods, directly serving the overarching goal of providing actionable optimization strategies for the winter heating operation of VRF systems in Zhengzhou. 
When MLR is applied alone, although it can output quantitative coefficients, it fails to reflect differences in operating conditions, resulting in fixed coefficients that are unable to guide dynamic adjustments. In contrast, when ARM is applied alone, it can mine qualitative rules but cannot quantify specific values, leaving the regulations at a descriptive level and making it difficult for engineers to set precise targets. Furthermore, the MLR and ARM collaborative framework is not presented as an isolated concept but is deeply integrated into each core chapter of the manuscript. Without combining MLR and ARM, the experiment would neither output the precise numerical values required for engineering practice nor capture the dynamic rules across different operating conditions, and the research objectives would be entirely unachievable. The innovation of this study lies in achieving the complementarity between quantification and qualification through the synergy of MLR and ARM. Specifically, MLR first quantifies the intensity of core impacts, providing an accurate numerical benchmark for engineering adjustments. Subsequently, ARM explores the dynamic deviations under different operating conditions. This logic of first quantifying the benchmark and then correcting for bias represents an achievement previously unattained by any single research methodology. It not only addresses the issue of poor adaptability of MLR to different operating conditions but also compensates for the limitation of ARM being unable to quantify impact intensity.
The contributions of this study are as follows.
(1)
The MLR quantitative model, combined with the ARM dynamic rules system proposed in this study, is fully integrable into VRF real-time control systems in terms of technical adaptability and engineering implementability.
(2)
The research findings are expected to provide key technical support for achieving energy conservation, smart city, and carbon-neutral sustainable development goals in various regions, particularly in Central China. The research conclusions directly serve to improve the sustainability of buildings in the region and provide localized data support for climate-adaptive VRF operating strategies.
(3)
This study takes a single building in Zhengzhou as the starting point, aiming to establish a methodological benchmark for in-depth analysis. In subsequent work, through the expansion to multiple buildings, multiple climate zones, and large-scale datasets, the limitations caused by a single scenario will be gradually eliminated. Ultimately, a VRF system optimization framework that combines robustness and practicality will be developed.

2. VRF System Operation Experiment

The data used in this study come from an experiment on the operational energy consumption of a VRF system in an office building. The building is a six-story office building in Zhengzhou, central China. The scope of the experiment includes the laboratories and offices on each floor, including those on the sixth floor. The building has a floor height of 3.6 m, and the area of a single monitored room is approximately 103.4 square meters. This study primarily uses data from experiments in which different combinations of terminal devices were activated and different indoor set temperatures and air speeds were applied, in order to explore, group by group, the operating patterns of the VRF system. The conclusions are highly applicable to office buildings in cold regions but require further validation in climatic zones characterized by severe cold, hot summers, and mild winters, as well as in residential and commercial buildings. For scenarios with severe load fluctuations or extreme climates, it is recommended to incorporate the non-linear rules mined by ARM for supplementary adjustments, thereby avoiding application deviations caused by model assumptions.
The experiment was conducted under winter conditions, with the air conditioning operation mode set to heating mode, using a VRF system. The outdoor unit is mounted on the rooftop of the office building. Four indoor units are placed in two rooms, with Terminal 1 and Terminal 2 in one room, and Terminal 3 and Terminal 4 in the other. Different controllers independently control the set parameters of the four indoor units. Figure 1 shows a site diagram of the VRF system, and Figure 2 shows a flow chart of the VRF system.
The VRF system is primarily composed of EXVs, compressors, four-way reversing valves, indoor and outdoor units, indoor and outdoor fans, temperature sensors, and pressure sensors. The compressor compresses the gaseous refrigerant into high-temperature and high-pressure gas and delivers it to the indoor unit under the heating mode. The temperature sensors are installed at the indoor and outdoor units, as well as at the inlet and outlet of the compressor, to monitor parameters such as fixed and variable frequency discharge temperature, fixed-variable frequency shell top oil temperature, inlet–outlet pipe temperature, and real-time high pressure at an hourly time scale. Meanwhile, a smart meter is mounted to measure the active power, total active power, bidirectional active power, as well as the current and voltage values of the entire system. The data collection interval is 3 s. The experiment adhered to the single-variable principle, striving to minimize interference from other factors on the experimental results. In experiments involving a variable room set temperature and a variable number of open indoor unit ends for multiple units, the wind speed was set to high. During the variable wind speed test, the quantity of operational indoor unit terminals and the set room temperature were maintained constant.
Before the experiment, windows and doors were opened to balance the indoor and outdoor temperatures. The experiment was officially initiated only after the indoor and outdoor temperatures had reached consistency. The indoor environment was kept closed while activating the indoor unit terminals of the VRF system during the experiment. After 15 min of stable operation, the terminals were turned off, and this process was repeated to obtain data under different operating conditions. In the experiment involving varying the number of VRF indoor unit terminals, each operating condition lasted approximately 15 min. The VRF system was not shut down when switching between conditions, ensuring the continuity of the experimental process. Experimental parameters were adjusted by controlling terminal activation combinations and indoor set temperatures. Specific experimental conditions and terminal activation settings are listed in Table 1, while variable wind speed experiment setups are shown in Table 2. The core objective of this study is to analyze the impact of different operating parameters on the energy consumption and key parameters of VRF systems. To achieve this goal, the experimental temperature settings need to cover a low, medium, and high gradient to capture the dynamic response of the system under different load demands. 18 °C represents the lower temperature for winter heating. 24 °C corresponds to the typical range of human comfort temperatures in winter. 30 °C represents a relatively high temperature setting or a test for the system’s adjustment capability under extreme operating conditions. Through this gradient, it is possible to systematically observe the non-linear changes in parameters such as compressor frequency, refrigerant flow, and heat exchange capacity of the VRF system as the set temperature increases, as well as how these changes affect energy consumption. 
Although 30 °C is uncommon in conventional winter heating, the necessity of setting this temperature is reflected in two aspects. (1) Under high temperature settings, VRF systems may enter high load operation mode, which differs significantly from medium and low temperature settings in terms of energy consumption characteristics and parameter correlation. By identifying patterns under these operating conditions, data can be provided to support energy efficiency optimization for the system under extreme demand, thereby avoiding efficiency degradation caused by long-term high-load operation. (2) This study was conducted in Zhengzhou, a city located in the central region of China. There is no centralized heating in this region during winter, and some buildings rely on VRF systems for independent heating. Some users may set high temperatures for short periods due to instantaneous heating needs in actual applications. Therefore, studying the 30 °C operating condition has practical significance for guiding users to adjust temperatures reasonably.

3. Research Methods for VRF System Operation

This study aims to utilize data mining technology to uncover the intrinsic relationship between operating parameters and energy consumption of VRF air conditioning systems, providing a theoretical basis and practical guidance for optimizing system energy efficiency. In response to the research objectives and characteristics of VRF system operating data, three methods were mainly employed in this study, including correlation analysis, MLR, and ARM. A multi-layered analytical framework is constructed, encompassing parameter screening, quantitative modeling, and pattern discovery. The combined application of the three methods not only quantifies the linear relationships between variables but also captures non-linear correlation patterns, ultimately providing a comprehensive analysis of the energy consumption characteristics of VRF systems. Figure 3 shows a framework diagram of this study.

3.1. Fundamentals of Correlation Analysis

Multiple methods can be employed for correlation analysis, the most commonly used being the Pearson correlation coefficient, which measures the linear correlation between two continuous variables. Its value ranges from −1 to 1, where −1 indicates a perfect negative correlation, 0 denotes no linear correlation, and 1 represents a perfect positive correlation. The calculation formula is as follows.
$$ r = \frac{\sum X_i Y_i - n\,\bar{X}\,\bar{Y}}{\sqrt{\sum X_i^2 - n\,\bar{X}^2} \times \sqrt{\sum Y_i^2 - n\,\bar{Y}^2}} $$
where $X_i$ and $Y_i$ are the X and Y values in the i-th sample, $n$ is the sample size, and $\bar{X}$ and $\bar{Y}$ are the mean values of X and Y.
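As an illustration only, the formula above can be evaluated directly in a few lines of Python. This is a minimal sketch, not the R workflow used in the study; the function name `pearson_r` and the sample readings are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed with the sum-based formula given above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum(X_i * Y_i) - n * X_bar * Y_bar
    num = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    # Denominator terms: sum(X_i^2) - n * X_bar^2 and sum(Y_i^2) - n * Y_bar^2
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    syy = sum(y * y for y in ys) - n * y_bar ** 2
    if sxx <= 0 or syy <= 0:
        raise ValueError("a constant series has no defined correlation")
    return num / (math.sqrt(sxx) * math.sqrt(syy))

# Hypothetical readings (not from the paper): compressor frequency in Hz
# and measured power in kW.
freq = [30.0, 40.0, 50.0, 60.0, 70.0]
power = [2.1, 2.9, 4.2, 4.8, 6.0]
r = pearson_r(freq, power)
print(f"r = {r:.4f}")
```

For long series with large mean values, the mean-centred form of the same formula is numerically safer than the sum-based form shown here.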
The core assumptions of Pearson correlation analysis include four aspects: variables being continuous, linear relationships existing between variables, data approximating a normal distribution, and the absence of significant outliers. The verification process for these assumptions is as follows.
(1)
Continuous variable validation. All variables included in the analysis are continuous numerical data, not categorical or discrete variables, and fully satisfy the continuous assumption.
(2)
Linear relationship verification analysis. Before the analysis, the relationship between variables was examined using a scatter plot matrix. The focus was on the scatter distribution of each parameter and energy consumption, and it was found that the fixed frequency 1 exhaust temperature and energy consumption, compressor frequency and energy consumption, etc., showed a clear linear trend. For variables that may exhibit non-linear relationships, the correlation coefficient (r = 0.040) indicates a weak linear association. Further analysis of their non-linear patterns will be conducted in subsequent association rule mining to avoid the limitations of relying solely on Pearson correlation.
(3)
Normality test for variables. Most variables had p values > 0.05, indicating approximate normal distribution. A few variables showed slight skewness, but due to the large sample size in this study, according to the central limit theorem, slight skewness had little effect on the Pearson correlation coefficient, and the results remained robust.
(4)
Outlier handling. Outliers are identified using box plots and handled as follows. Outliers caused by instantaneous fluctuations in sensors are replaced by interpolation of adjacent data. Transition data during system startup and shutdown phases are directly discarded to ensure that the analysis data comes from the stable operation phase of the system.
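The outlier-handling step in item (4) can be sketched as follows. The paper does not state its box-plot threshold, so the conventional 1.5 × IQR whisker rule is assumed here, and "interpolation of adjacent data" is read as linear interpolation between the nearest non-outlier neighbours. The function names and the temperature samples are hypothetical.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) box-plot fences; k = 1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def replace_outliers(values, k=1.5):
    """Replace points outside the fences by linear interpolation
    between the nearest non-outlier neighbours on each side."""
    lo, hi = iqr_bounds(values, k)
    ok = [lo <= v <= hi for v in values]
    cleaned = list(values)
    for i, good in enumerate(ok):
        if good:
            continue
        left = next((j for j in range(i - 1, -1, -1) if ok[j]), None)
        right = next((j for j in range(i + 1, len(values)) if ok[j]), None)
        if left is not None and right is not None:
            w = (i - left) / (right - left)
            cleaned[i] = values[left] + w * (values[right] - values[left])
        elif left is not None:      # outlier at the end of the series
            cleaned[i] = values[left]
        elif right is not None:     # outlier at the start of the series
            cleaned[i] = values[right]
    return cleaned

# Hypothetical discharge-temperature samples (deg C) with one sensor spike.
temps = [71.0, 71.2, 71.1, 71.3, 95.0, 71.4, 71.2, 71.5, 71.3, 71.4]
cleaned = replace_outliers(temps)
print(cleaned)
```

Start-up and shutdown transition data, which the study discards outright, would be removed before this step, for example by dropping samples outside the stable-operation window.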

3.2. Theoretical Basis of Regression Analysis

The core idea of regression analysis is to construct a function to describe the relationship between variables. Classified by the number of variables, regression analysis can be divided into simple regression and multiple regression. Among these, multiple regression is more suitable for analyzing multi-factor correlations in complex problems. Its ultimate goal is to establish a model capable of predicting outcomes accurately. Meanwhile, statistical indicators such as R-squared and sum of squared errors need to be used to test the model’s accuracy, so as to ensure the reliability of prediction and analysis. The mathematical model of multiple linear regression is as follows.
$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon $$
where $y$ denotes the dependent variable, $x_1, x_2, \ldots, x_n$ represent the independent variables, $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$ are the regression coefficients, and $\varepsilon$ signifies the random error term.
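To make the model concrete, the sketch below fits the coefficients by ordinary least squares through the normal equations and reports R-squared. It is a minimal illustration in pure Python, not the SPSS procedure used in the study. The two predictors, their values, and the generating coefficients are synthetic assumptions chosen so that the fit can be checked by eye.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(rows, y):
    """Least-squares estimate [beta_0, beta_1, ..., beta_n] from (X'X) beta = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)] for a in range(p)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(p)]
    return solve_linear(xtx, xty)

def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def r_squared(beta, rows, y):
    """R^2 = 1 - SSE / SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst

# Synthetic, noise-free data (not from the paper): x1 = compressor frequency (Hz),
# x2 = inlet-outlet pipe temperature difference (deg C).
rows = [(30, 5), (40, 8), (50, 6), (60, 9), (70, 7), (45, 10)]
y = [1.0 + 0.05 * f - 0.02 * d for f, d in rows]
beta = fit_mlr(rows, y)
print([round(b, 4) for b in beta], round(r_squared(beta, rows, y), 6))
```

Solving the normal equations is adequate for a handful of well-scaled predictors; with strongly correlated inputs, a QR-based or library solver would be the more robust choice.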

3.3. Methodological Basis for Association Rule Analysis

Association rule analysis is a commonly used technique in the field of data mining, whose core purpose is to explore the associative relationships between different attributes in a dataset, thereby providing support for decision-making and problem-solving. Its key concepts include support and confidence, and it is typically implemented using the Apriori algorithm. This algorithm relies on the property that every subset of a frequent itemset must be frequent and, equivalently, every superset of an infrequent itemset must be infrequent. It first scans the dataset to generate candidate itemsets, then keeps those that meet the minimum support threshold to obtain valid frequent itemsets. This process is iterated to generate higher-order frequent itemsets, with the goal of mining frequent itemsets and association rules from large-scale data. The iteration continues until no new frequent itemsets can be generated.
For VRF operating data, association rule analysis was conducted in this study based on three operating conditions: variable set temperature, variable load rate, and variable wind speed. Taking into account the probability of occurrence of the consequent items, first-order and second-order rules are selected for analysis, with a focus on association rules that have a single antecedent item.
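The level-wise search described above can be sketched in a few lines of Python. This is a minimal illustration of the Apriori idea, not the SPSS Modeler implementation used in the study. The discretized item labels such as `freq=high` and the thresholds are hypothetical. Rule generation is restricted to single-item antecedents, in line with the focus stated above.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (level-wise search)."""
    n = len(transactions)
    tx = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in tx for i in t}  # 1-item candidates
    frequent = {}
    k = 1
    while current:
        # Count support and keep candidates that meet the threshold.
        level = {}
        for cand in current:
            sup = sum(1 for t in tx if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        keys = list(level)
        joined = {a | b for a in keys for b in keys if len(a | b) == k}
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def single_antecedent_rules(frequent, min_confidence):
    """List (antecedent, consequent, support, confidence) with one antecedent item."""
    out = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            ante = frozenset([item])
            conf = sup / frequent[ante]  # confidence = supp(A and B) / supp(A)
            if conf >= min_confidence:
                out.append((ante, itemset - ante, sup, conf))
    return out

# Hypothetical discretized operating records (not from the paper).
records = [
    {"freq=high", "energy=high"},
    {"freq=high", "energy=high", "exv=high"},
    {"freq=low", "energy=low"},
    {"freq=high", "energy=high"},
    {"freq=low", "energy=low", "exv=low"},
]
frequent = apriori(records, min_support=0.4)
found = single_antecedent_rules(frequent, min_confidence=0.8)
for ante, cons, sup, conf in found:
    print(set(ante), "->", set(cons), f"support={sup:.2f} confidence={conf:.2f}")
```

Continuous sensor readings must be binned into categorical items before mining; the bin edges strongly influence which rules appear and are left open here.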

4. Analysis Results of VRF Operation Data

The results of VRF operation data analysis in this section focus on the unique data characteristics of the winter heating operation of VRF systems in Zhengzhou. The valid data covers three temperature gradients (18 °C, 24 °C, and 30 °C) and records the relationships between 11 core parameters and real-time energy consumption, presenting typical features of strong parameter coupling and high operating condition dynamics. In response to these characteristics, the analysis in this section can quantify the independent impact intensity of core parameters on energy consumption, verify the applicability of the MLR model in winter heating scenarios, and identify differences in the correlation between parameters and energy consumption under different operating conditions. This section strictly follows the progressive logic of parameter screening, model construction, and operating condition verification, forming a closed loop with the research methods presented in Section 3. In Section 4.1, 11 variables significantly correlated with energy consumption were screened out from 24 initial parameters, which directly simplified the variables of the MLR model. In Section 4.3, with 11 parameters as input and energy consumption as output, an 8-parameter model was finally obtained by eliminating variables with multicollinearity. The fundamental purpose of this section is to provide support for Section 5 and Section 6 through verifiable and quantifiable data analysis.

4.1. Correlation Analysis

Given the numerous operational parameters of VRF systems, it is necessary to first screen out those significantly correlated with energy consumption for further analysis. This study initially selected 24 characteristic variables for correlation analysis, using experimental data collected when terminal device 1 was continuously activated for four hours while other devices were turned off. The correlation analysis was performed using R software (R-4.5.1-win), with specific results shown in Figure 4. The analysis results are presented as a circle plot, where the color of each circle reflects the correlation coefficient. Red denotes a high negative correlation coefficient, blue represents a high positive correlation coefficient, and lighter hues indicate that the correlation coefficient is close to 0.
Figure 4 shows the strength and statistical significance of the correlation between each of the 24 variables and energy consumption. The results show that 11 characteristic variables are correlated with energy consumption: fixed frequency discharge temperature, fixed frequency 1 shell top oil temperature, fixed frequency 2 shell top oil temperature, average low pressure, compressor frequency, instantaneous output capacity, electrical box temperature, inlet pipe temperature, outlet pipe temperature, ambient temperature, and EXV step count. Among them, average low pressure, electrical box temperature, and EXV step count are significant at p < 0.01 and strongly correlated with energy consumption. The fixed frequency discharge temperature, the fixed frequency 1 shell top oil temperature, the fixed frequency 2 shell top oil temperature, the inlet pipe temperature, the outlet pipe temperature, and the ambient temperature exhibit extremely high statistical significance (p < 0.001) and extremely strong correlation. Because the selected dataset is limited in comprehensiveness and duration, it may not cover all parameters related to energy consumption. The data in this study were collected in Zhengzhou, in Central China, which lies in a cold climate zone, and the experiment was conducted under winter heating conditions. The results are therefore most directly applicable to Central China. Climatic conditions differ in other regions, but the operating principles of air-conditioning components remain the same, so the results in Figure 4 can also serve as a reference elsewhere.
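As a rough illustration of the screening step, the sketch below computes a Pearson correlation coefficient and the associated t statistic for one candidate variable. It is written in Python rather than R and is not the workflow used in this study; the function names are hypothetical, and converting the t statistic into a p-value would additionally require a Student t distribution with n − 2 degrees of freedom, which is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n observations (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))
```

A variable would be retained when the p-value derived from this statistic falls below the chosen significance level.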

4.2. Energy Consumption Regression Analysis

In the linear regression analysis conducted in this section, the dependent variable is energy consumption. The independent variables are the 11 characteristic variables with significant correlation shown in Figure 4 of Section 4.1. A regression model was constructed using these variables. Model specifications are presented in Table 3 and Figure 5.
The VIF values for Fixed Frequency 1 Shell Top Oil Temperature, Compressor Frequency, Instantaneous Output Capacity, Inlet Pipe Temperature, and Outlet Pipe Temperature all exceeded 10, with tolerance values below 0.1, indicating that the model suffers from multicollinearity. Multicollinearity can lead to unreliable regression coefficients, inflated variances, reduced interpretability, and model instability. In addition, the p-values for Compressor Frequency, Instantaneous Output Capacity, and Electrical Box Temperature were greater than 0.05, meaning that these coefficients are not statistically significant and that the model cannot be used in this form.
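To make the VIF and tolerance criteria concrete, the following sketch computes both from a set of predictor columns. It relies on the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix, and that tolerance is 1/VIF. The function names and the two example columns are hypothetical and are not data from this study.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two hypothetical predictors whose correlation is 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vifs = vif([x1, x2])
tolerances = [1 / v for v in vifs]
```

With two predictors the result reduces to 1/(1 − r²), so a correlation of 0.8 gives a VIF of about 2.78 and a tolerance of 0.36, well inside the thresholds of 10 and 0.1 used above.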

4.3. Multicollinearity Analysis

During model validation, a high correlation coefficient was found between Compressor Frequency and Instantaneous Output Capacity, which also overlap in physical meaning. Therefore, a collinearity diagnosis was performed on these variables. A linear regression analysis was conducted with energy consumption as the dependent variable and compressor frequency and instantaneous output capacity as the independent variables. Condition indices are calculated from the eigenvalues: the condition index of a given dimension is the square root of the ratio of the largest eigenvalue to the eigenvalue of that dimension. Typically, when the condition index exceeds 15, a collinearity issue may exist in that dimension. If the variance proportions of several variables exceed 0.9 in such a dimension, collinearity between those variables may be suspected. As shown in Table 4 and Figure 6, in the third dimension, both compressor frequency and instantaneous output capacity exhibit a variance proportion of 1, exceeding the threshold of 0.9. This indicates collinearity between the two variables. Instantaneous output capacity is a derived indicator and verification variable calculated from parameters such as frequency and temperature, and it is used to determine whether the unit is operating at its predetermined capacity. Unlike compressor frequency, it is not a core parameter for system operation. Therefore, instantaneous output capacity was removed.
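For reference, the two diagnostics used in this collinearity check can be written out explicitly. The notation is introduced here for illustration and does not appear in the original tables: λ_k denotes the k-th eigenvalue of the scaled cross-product matrix of the regressors (including the constant), and v_{jk} denotes the element of the k-th eigenvector that belongs to regressor j.

```latex
% Condition index of dimension k
\eta_k = \sqrt{\frac{\lambda_{\max}}{\lambda_k}}

% Variance proportion of regressor j attributed to dimension k
\pi_{jk} = \frac{v_{jk}^{2} / \lambda_k}{\sum_{m} v_{jm}^{2} / \lambda_m}
```

A dimension with a small eigenvalue therefore has a large condition index, and when two regressors both load most of their coefficient variance onto that dimension, their variance proportions approach 1 together.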
Figure 4 further shows that the correlation coefficients among the fixed-frequency 1 shell-top oil temperature, fixed-frequency 2 shell-top oil temperature, and fixed-frequency 1 discharge temperature are relatively high. Consequently, a collinearity diagnosis was performed on these three characteristic variables. The linear regression analysis was performed with energy consumption as the dependent variable and Fixed Frequency 1 Shell Top Oil Temperature, Fixed Frequency 2 Shell Top Oil Temperature, and Fixed Frequency 1 Discharge Temperature as the independent variables. As shown in Table 5 and Figure 7, within the fourth dimension, the variance proportions of both Fixed Frequency 1 Shell Top Oil Temperature and Fixed Frequency 2 Shell Top Oil Temperature exceed 0.9. This suggests collinearity between the two variables. The fixed-frequency 1 compressor is the main compressor for system operation, so its shell top oil temperature better reflects the thermal state of the core equipment. Therefore, the fixed-frequency 2 shell top oil temperature was removed.
Meanwhile, the inlet and outlet pipe temperatures were found to be highly correlated and to have similar meanings. A linear regression analysis was performed with energy consumption as the dependent variable and these two variables as the independent variables. As shown in Table 6 and Figure 8, within the third dimension, the variance proportions of both the inlet pipe temperature and the outlet pipe temperature exceed 0.9, indicating the presence of collinearity. The inlet pipe temperature more directly reflects the initial state of the refrigerant entering the heat exchanger, but both temperatures carry physical information. Therefore, the two were combined into the inlet–outlet temperature difference, which eliminates collinearity while retaining physical meaning and maintaining data stability and reliability.
Based on the collinearity diagnosis results, the independent variables were reorganized, and a new regression model was established with energy consumption as the dependent variable. As shown in Table 7 and Figure 9, the VIF values of the eight adjusted characteristic variables are all less than 10, and their tolerance values exceed 0.1. This indicates the absence of severe multicollinearity among the variables. Additionally, all significance values are less than 0.05, confirming the statistical significance of the model. The model has an R² value of 0.925, indicating that Fixed Frequency 1 Discharge Temperature, Fixed Frequency 1 Shell Top Oil Temperature, Average Low Pressure, Compressor Frequency, Electrical Box Temperature, Inlet–Outlet Temperature Difference, Ambient Temperature, and EXV Step Count can explain 92.5% of the variation in energy consumption. This is well above the explanatory power of 30% taken here as the threshold for a valid model. Additionally, the table indicates that Fixed Frequency 1 Discharge Temperature, Compressor Frequency, Ambient Temperature, and EXV Step Count have a significant positive impact on energy consumption, meaning higher values of these parameters are associated with higher energy consumption. The results also show that Fixed Frequency 1 Shell Top Oil Temperature, Average Low Pressure, Electrical Box Temperature, and Inlet–Outlet Temperature Difference have a significant negative impact on energy consumption, meaning lower values of these parameters are associated with higher energy consumption.
Based on the results of the multicollinearity diagnosis, highly collinear variables were eliminated. The resulting multiple linear regression equation relates energy consumption, as the dependent variable, to 8 core independent variables, such as discharge temperature and shell top oil temperature, and is presented as follows.
$$C_e = 527.121 + 0.688\,T_{ff1} - 0.538\,T_{sto} - 0.274\,P_{la} + 0.019\,F_c - 0.174\,T_{box} - 0.053\,\Delta T + 0.074\,T_e + 0.004\,K_{exv}$$
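The regression equation can be evaluated with a few lines of code. The sketch below is only an illustration: the coefficient magnitudes are transcribed from the equation, the signs follow the positive and negative effects reported for the eight-variable model, and the dictionary keys are hypothetical names. Inputs are assumed to be in the same units as the data used to fit the model.

```python
# Intercept and coefficients of the eight-variable MLR model.
# Key names are hypothetical labels chosen for this sketch.
INTERCEPT = 527.121
COEFFICIENTS = {
    "ff1_discharge_temp": 0.688,       # positive effect
    "ff1_shell_top_oil_temp": -0.538,  # negative effect
    "avg_low_pressure": -0.274,        # negative effect
    "compressor_frequency": 0.019,     # positive effect
    "electrical_box_temp": -0.174,     # negative effect
    "inlet_outlet_temp_diff": -0.053,  # negative effect
    "ambient_temp": 0.074,             # positive effect
    "exv_steps": 0.004,                # positive effect
}


def predict_energy(params):
    """Evaluate the regression equation for one set of operating parameters."""
    return INTERCEPT + sum(coef * params[name]
                           for name, coef in COEFFICIENTS.items())


def marginal_change(name, delta):
    """Predicted change in energy consumption when one parameter changes by
    delta and all other parameters are held constant."""
    return COEFFICIENTS[name] * delta
```

For example, `marginal_change("exv_steps", 100)` gives the predicted change for a 100-step increase in EXV opening with everything else fixed.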

4.4. Analysis of Regression Equation Results

To further validate the reliability and applicability of the regression model, it is necessary to analyze variable independence, collinearity, and residual distribution characteristics, ensuring the model meets statistical assumptions and can effectively explain the laws of energy consumption changes. The specific analyses are as follows.
(1) Assessment of variable independence
The reliability of a regression model depends on the independence of its observations. In this study, the Durbin–Watson (DW) statistic was used to test for serial correlation. The result was DW = 1.03, which deviates from the ideal value of 2.0 and indicates positive serial correlation. This can be attributed to the temporal continuity of the VRF system parameters, which is consistent with the dynamic characteristics of HVAC systems. However, the DW value did not fall into the strong correlation interval (DW < 1.0), and the absolute values of the autocorrelation coefficients of the model residuals were all less than 0.2. These results indicate that sample independence basically meets the requirements of regression analysis and that the bias of parameter estimation is controllable.
(2) Quantitative diagnosis and elimination effect of multicollinearity
Multicollinearity can inflate the variance of regression coefficients. In this study, it was diagnosed using the VIF and tolerance. For the optimized model, all independent variables exhibited a VIF below 10 and a tolerance above 0.1 (as shown in Table 7), meeting the critical threshold requirements. Among them, the VIF values of the fixed frequency 1 discharge temperature (VIF = 6.709) and the fixed frequency 1 shell top oil temperature (VIF = 9.294) are closest to the critical value. By eliminating highly collinear variables, the stability of the model parameter estimation was improved. For instance, the standardized coefficient of the EXV step count (beta = 0.026, t = 4.621, p < 0.001) aligns with the physical meaning and engineering practice. This result confirms the effectiveness of collinearity elimination, indicating that the new model is free from severe multicollinearity.
(3) Normal distribution of residuals
Residual testing is crucial for evaluating the validity of a regression model. A non-normal distribution of residuals may lead to model bias. In this study, the residual distribution was visually inspected using a normal probability plot (see Figure 10). Residuals that follow a normal distribution ensure the randomness of prediction errors, which aligns with statistical assumptions and thus guarantees the model’s reliability.
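The Durbin–Watson statistic used in item (1) can be computed directly from the ordered residual series. The following is a minimal sketch with a hypothetical function name. Values near 2 indicate no first-order serial correlation, values toward 0 indicate positive serial correlation, and values toward 4 indicate negative serial correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of a residual series in time order."""
    # Sum of squared differences between consecutive residuals.
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    # Sum of squared residuals.
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Because the statistic depends on the order of the samples, the residuals must be passed in the order in which the operating data were recorded.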
Regarding the interaction effects between variables, this study did not include them in the model at the initial stage for the following main reasons. First, the core of this study is to identify the key parameters that exert a significant impact on energy consumption, thereby providing clear variables for the basic optimization of VRF systems. In contrast, interaction effect analysis is more suitable for exploring the synergistic effects between parameters. Second, the 24 initial variables result in a relatively high dimensionality, and the introduction of interaction terms would significantly increase model complexity, potentially leading to overfitting.
As shown in Figure 10, the residuals of this model do not conform to a normal distribution. Two factors may explain this. (1) The model explains 92.5% of the variation in energy consumption, but some potential influences are not included, such as dynamic boundary conditions like solar radiation intensity and indoor occupant density. These could introduce systematic errors into the residuals. (2) The non-linear operating characteristics of the VRF system, such as compressor start-stop transients and dynamic adjustment of the electronic expansion valve, cause energy consumption fluctuations at certain sample points that a linear model cannot fully capture. This manifests as an increase in residual dispersion.
The model exhibits deviations in predictions under extreme load conditions and transient compressor start-stop conditions. Additionally, it does not incorporate the specific impacts of parameter interaction terms and dynamic boundary conditions. Despite these limitations, the model's high R² value, significant F-test result, and physically reasonable parameter signs indicate that it can effectively reveal the linear correlation between energy consumption and the core operating parameters of a VRF system. It can therefore be used as a basic analytical tool for optimizing energy efficiency. The regression coefficients of MLR have intuitive physical meanings, and the method yields linear formulas and explicit relationships between variables, making it well suited to practical applications. In contrast, even though more advanced non-linear methods may achieve slightly higher prediction accuracy, their black-box nature prevents them from outputting such quantitative coefficients, and thus they cannot clarify the specific intensity of each impact. Therefore, multiple linear regression is selected as the core prediction model. Figure 11 shows the regression equation fitting graph generated from the data. The plot indicates that the fixed-frequency compressor 1 discharge temperature, electrical box temperature, ambient temperature, and compressor frequency exhibit favorable fitting performance, with data points closely clustered around the regression lines. Thus, it can be inferred that the remaining deviation of the regression equation is related to the data dispersion of fixed-frequency compressor 1 shell top oil temperature, average low pressure, inlet–outlet pipe temperature difference, and EXV step count.

5. Association Rule Analysis of Variable Refrigerant Flow System Operation Data

Because the raw experimental data are continuous, they cannot be used directly for association rule mining and must first be discretized, as described in Section 5.1. The Apriori algorithm was then employed for association rule analysis. This algorithm scans the dataset to calculate the support and confidence of candidate rules, filters the rules against preset thresholds, and repeats the process until no new rules can be derived. The objective of this study is to improve the operational efficiency of VRF air-conditioning systems and the sustainability of buildings. Therefore, the system energy consumption index is predefined as the key outcome variable. Unlike traditional ARM, which explores all possible associations indiscriminately, the ARM analysis here mines only rules whose consequent is this predefined outcome variable, so that the discovered rules explain or predict energy consumption. This design makes the analysis more targeted and practical, thereby better serving the research goal of improving building sustainability.
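Restricting the search to a predefined outcome variable can be sketched as follows. This is an illustrative simplification rather than the procedure used in this study: it only considers rules with a single antecedent item, and the function name, field names, and example records are hypothetical.

```python
from collections import Counter


def mine_target_rules(records, target_key, min_support, min_confidence):
    """Single-antecedent rules whose consequent is the target variable.

    Each record is a dict mapping a parameter name to its discretized level.
    Returns (antecedent, consequent, support, confidence) tuples, sorted by
    descending confidence and then descending support.
    """
    n = len(records)
    antecedent_counts = Counter()
    joint_counts = Counter()
    for record in records:
        target_level = record[target_key]
        for key, level in record.items():
            if key == target_key:
                continue
            antecedent_counts[(key, level)] += 1
            joint_counts[(key, level, target_level)] += 1

    rules = []
    for (key, level, target_level), count in joint_counts.items():
        support = count / n
        confidence = count / antecedent_counts[(key, level)]
        if support >= min_support and confidence >= min_confidence:
            rules.append(((key, level), (target_key, target_level),
                          support, confidence))
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Hypothetical discretized operating records, for illustration only.
demo_records = [
    {"exv": "low", "discharge": "low", "energy": "low"},
    {"exv": "low", "discharge": "high", "energy": "low"},
    {"exv": "low", "discharge": "high", "energy": "high"},
    {"exv": "high", "discharge": "high", "energy": "high"},
    {"exv": "high", "discharge": "high", "energy": "high"},
]
demo_rules = mine_target_rules(demo_records, "energy",
                               min_support=0.2, min_confidence=0.5)
```

Fixing the consequent keeps the rule set small and directly interpretable in terms of energy consumption, at the cost of ignoring associations among the operating parameters themselves.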

5.1. Data Discretisation Pre-Processing

Since the original experimental data are continuous, they cannot be directly used for association rule mining and must be discretized first. Data discretization refers to the process of converting continuous data into a small number of discrete values. Its purpose is to reduce the impact of noise on the analysis, reduce the complexity of the dataset, and enable algorithms to process the data more effectively.
Two discretization methods are commonly used: equal-frequency discretization and equal-width discretization. Equal-frequency discretization divides continuous data into k discrete intervals, each containing an equal number of data points. Equal-width discretization divides continuous data into k discrete intervals of equal width, so the number of data points within each interval may vary. The procedure is as follows. First, determine the maximum and minimum values of the data. Next, divide the range into k intervals, each with a width of (max − min)/k. Then, determine the interval boundaries from the minimum value and the width. For example, if the minimum value is a and the bin width is b, the first bin is [a, a + b), the second bin is [a + b, a + 2b), and so on, with the last bin also including the maximum value. The result is k discrete intervals that may each contain a different number of data points.
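The two binning schemes described above can be compared with a short sketch. The function names and the sample values are hypothetical and are not data from this study. Bin indices start at 0, and the maximum value is assigned to the last bin.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins of width (max - min) / k."""
    low, high = min(values), max(values)
    width = (high - low) / k
    labels = []
    for v in values:
        index = int((v - low) / width) if width > 0 else 0
        labels.append(min(index, k - 1))  # the maximum falls in the last bin
    return labels


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        # Tied values may be split across neighbouring bins.
        labels[i] = min(rank * k // len(values), k - 1)
    return labels


# Hypothetical temperature readings, for illustration only.
sample = [10, 12, 14, 30, 50, 70]
width_labels = equal_width_bins(sample, 3)
frequency_labels = equal_frequency_bins(sample, 3)
```

With these readings the equal-width bins hold 3, 1, and 2 points, whereas the equal-frequency bins hold 2 points each, which illustrates how skewed data lead to uneven bin counts under equal-width discretization.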
Equal-width discretization was selected for data processing in this study. For temperature-related parameters, the safe operating range of the compressor was combined with the quartile distribution of the data to define three intervals: low, medium, and high. For parameters such as energy consumption, compressor frequency, and EXV step count, the thresholds were set according to the variable gradients used in the experiment. Equal-width discretization does not disrupt the continuity of physical parameters, is well aligned with engineering intuition, and offers high stability. The binning method selected in this study did not induce sign inversion. This strategy ensures that the energy consumption rules extracted for VRF systems not only match the data patterns but also exhibit physical rationality and engineering reliability.
In the classical theory of the ARM method, the core function of the support threshold is to filter out low-frequency, meaningless itemsets, while the confidence threshold serves to ensure the reliability of rule predictions. According to the minimum support and confidence framework proposed by Han [43], the minimum support needs to cover at least 5% to 10% of the sample size to prevent rules from arising solely from accidental data fluctuations. The minimum confidence must be higher than the probability of random correlations, and it is usually set at 30% to 50% to ensure that the rule prediction accuracy is better than random guessing. The threshold setting in this study adheres to this classical criterion. The operating data of HVAC systems are unevenly distributed: core parameters are concentrated in high-frequency ranges, while extreme operating conditions are scattered in low-frequency ranges. Therefore, common practice in this domain keeps the support within 10% to 25% and the confidence within 30% to 50%. The thresholds in this study fall within this range and are fine-tuned according to specific operating conditions, so they conform to general standards while also adapting to the data characteristics of this study.

5.2. Results of Different Set Temperatures for Variable Refrigerant Flow Systems

In the variable set temperature experiment, indoor unit terminals 1 and 2 were activated. This ensures stable operation of the indoor units and avoids interference from other factors. The indoor units were set to run at high fan speed. The Apriori algorithm was applied to mine the data from this operating condition, with a minimum itemset support of 20% and a minimum rule confidence of 50%. The resulting association rules are shown in Table 8. A support of 20% ensures coverage of high-frequency parameter combinations, while a confidence of 50% corresponds to a rule prediction accuracy of 85% and covers 82% of high-frequency temperature adjustment scenarios. If the confidence is reduced to 30%, the number of rules increases by 35%, but the additional rules include weak rules that contradict thermodynamic laws, such as high ambient temperature leading to low energy consumption, which undermines their value for engineering applications.
Based on the analysis of Table 8, the following observations are made. When the temperature is set to 18 °C, 24 °C, and 30 °C, the rule that a low Terminal 1 EXV step count results in low energy consumption holds with confidence levels of 58.947%, 84.848%, and 100%, respectively. This indicates a positive correlation between EXV step count and energy consumption that remains consistent regardless of the set temperature. For the inlet pipe temperature of Terminal 2, the association rules and their confidence levels differ across set temperatures. At 18 °C, a high inlet pipe temperature of Terminal 2 corresponds to high energy consumption, with a confidence level of 80.952%. At 24 °C, a relatively low inlet pipe temperature corresponds to low energy consumption, with a confidence level of 75.862%. At 30 °C, a high inlet pipe temperature corresponds to low energy consumption, with a confidence level of 100%. These results indicate that as the set temperature increases, the relationship between the terminal inlet pipe temperature and energy consumption reverses, shifting from a positive to a negative correlation. This may be attributed to changes in the system's operating mode and energy distribution as the set temperature increases.
Furthermore, at 18 °C, the rule that a high variable frequency discharge temperature leads to high energy consumption has a confidence level of 100%. At 24 °C, the rule that a relatively low variable frequency discharge temperature leads to low energy consumption has a confidence level of 90.909%. At 30 °C, the rule that a low variable frequency discharge temperature leads to moderate energy consumption has a confidence level of 100%. This indicates that as the indoor set temperature increases, the influence of variable frequency discharge temperature on energy consumption transitions from positive to negative, and that this influence changes relatively slowly with the indoor set temperature. The intrinsic relationship between the variable frequency discharge temperature and energy consumption therefore differs across indoor set temperatures, possibly because a higher set temperature alters the system's energy conversion and consumption mechanisms. The rule that a high variable frequency shell top oil temperature causes high energy consumption has confidence levels of 66.379%, 90.323%, and 74.545% at 18 °C, 24 °C, and 30 °C, respectively. This indicates that as the set temperature increases, the influence of variable frequency shell top oil temperature on energy consumption transitions from positive to negative, further demonstrating the significant moderating role of temperature on the relationships between system parameters and energy consumption.

5.3. Results of Different Load Rates for Variable Refrigerant Flow Systems

In the experiment investigating the variation in activated indoor unit terminals in a VRF system, the indoor set temperature was maintained at 18 °C. The total experimental duration was 4166 s. The changes in indoor unit terminals are shown in Table 9. The experiment adhered to the principle of a single variable, with all indoor units set to high fan speed operation. The Apriori algorithm was employed for data mining, with a minimum support threshold of 10% and a minimum confidence threshold of 50%. The generated association rules are detailed in Table 10. A support of 10% can accurately cover such critical load switching scenarios, and a confidence of 50% can ensure the physical consistency of the correlation between load rate and parameters.
Comparative analysis of Table 10 reveals the following. When terminals 1 and 3, terminals 1, 3, and 4, and terminals 3 and 4 are activated, the rule that a low variable frequency discharge temperature results in high energy consumption has confidence levels of 100%, 100%, and 70%, respectively. This indicates that as the number of activated terminals increases, the correlation between energy consumption and variable frequency discharge temperature shifts from negative to positive. Conversely, as the number of terminals decreases, this correlation shifts from positive to negative. This demonstrates the significant impact of changes in the system load rate on the relationship between the variable frequency discharge temperature and energy consumption. The number of activated terminals likely alters the load demand and refrigerant flow distribution of the system, thereby influencing this correlation. The rule that a high inlet–outlet pipe temperature difference at terminal 1 results in high energy consumption has confidence levels of 100% and 83.333% when terminals 1 and 3 and terminals 1, 3, and 4 are activated, respectively. This indicates that as the number of activated terminals increases, the influence of inlet–outlet temperature difference on energy consumption shifts from positive to negative. This can be attributed to the fact that an increase in the number of activated terminals modifies the refrigerant flow rate and heat exchange dynamics within the system, thereby reversing the relationship between inlet–outlet temperature difference and energy consumption.
When terminals 1 and 3 are activated, the rule that a relatively low EXV step count of terminal 1 leads to high energy consumption has a confidence level of 100%. When terminals 1, 3, and 4 are activated, the rule that a high EXV step count of terminal 1 leads to high energy consumption has a confidence level of 100%. This indicates that as the number of activated terminals increases, the influence of the EXV step count on energy consumption shifts from negative to positive, reflecting the complex relationship between EXV step adjustments and energy consumption under varying system loads. When terminals 1 and 3 are activated, a low fixed-frequency discharge temperature corresponds to moderate energy consumption, with a confidence level of 86.667%. When terminals 1, 3, and 4 are activated, a high fixed-frequency discharge temperature corresponds to high energy consumption, with a confidence level of 100%. When terminals 3 and 4 are activated, a high fixed-frequency discharge temperature corresponds to low energy consumption, with a confidence level of 77.778%. This indicates that as the number of activated terminals increases, the influence of fixed frequency discharge temperature on energy consumption shifts from negative to positive. Conversely, as the number of activated terminals decreases, this influence shifts from positive to negative, highlighting the significant impact of terminal activation count on the relationship between fixed frequency discharge temperature and energy consumption.
When terminals 1, 3, and 4 are activated, a moderate inlet–outlet pipe temperature difference at terminal 4 corresponds to high energy consumption, with a confidence level of 63.793%. When terminals 3 and 4 are activated, a high inlet–outlet pipe temperature difference at terminal 4 corresponds to high energy consumption, with a confidence level of 100%. This indicates that when the number of activated terminals decreases, the influence of inlet–outlet temperature difference on energy consumption shifts from negative to positive, further validating the close relationship between terminal activation count and the correlation between temperature difference and energy consumption. When terminals 1, 3, and 4 are activated, the rule that a high EXV step count of terminal 4 leads to high energy consumption has a confidence level of 100%. When terminals 3 and 4 are activated, the same rule holds with a confidence level of 96.19%. This indicates that when the number of activated terminals decreases, the influence of EXV step count on energy consumption remains consistent, demonstrating the stability of the positive correlation between terminal 4's EXV step count and energy consumption across different load rates.

5.4. Results of Different Fan Speeds for Variable Refrigerant Flow Systems

To control for the influence of extraneous factors, experimental data recorded under similar outdoor meteorological parameters were selected for analysis in the variable fan speed experiment. The indoor temperature was set at 24 °C, and only indoor unit terminal 1 was activated. Experiments at low, medium, and high fan speeds were conducted separately to ensure stable operation of each group under its set conditions. The minimum itemset support was set to 5% and the minimum rule confidence to 30%, and the association rules that meet these requirements are shown in Table 11. A support of 5% can accurately cover such critical load switching scenarios, and a confidence of 30% can ensure the physical consistency of the correlation between fan speed and parameters.
Comparative analysis of Table 11 reveals the following. When terminal 1 is activated and set to low fan speed, the rule that a high variable frequency discharge temperature leads to high energy consumption has a confidence level of 33.333%. At medium fan speed, the same rule has a confidence level of 100%. At high fan speed, the rule that a low variable frequency discharge temperature leads to low energy consumption has a confidence level of 100%. This indicates that the influence of variable frequency discharge temperature on energy consumption is generally positive as fan speed increases, although it manifests differently at high fan speed. This divergence may arise from fan-speed-dependent changes in indoor–outdoor heat exchange efficiency, which in turn alter the relationship between discharge temperature and energy consumption.
When the terminal 1 wind speed is set to low wind speed, the probability of high variable frequency shell top oil temperature leading to high energy consumption is 65.217%. When set to medium wind speed, the probability is 100%. When set to high wind speed, the probability is 100%. This finding suggests that the relationship between increased variable frequency shell top oil temperature and increased energy consumption remains constant with variations in wind speed. This indicates that the intrinsic relationship between these variables remains relatively stable under different wind speed conditions.
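The way the two thresholds filter single-antecedent rules can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "conditional support" means the share of records that satisfy the antecedent, and that each parameter has already been discretized into levels such as low, moderate, and high. The function name, the record layout, and the sample records are hypothetical.

```python
from collections import Counter


def mine_single_antecedent_rules(records, target, min_support=0.05, min_confidence=0.30):
    """Return (antecedent, consequent, support, confidence) tuples.

    records: list of dicts mapping a parameter name to a discretized level.
    target:  key that holds the consequent (here, the energy-consumption level).
    Support is taken as the share of records matching the antecedent
    (assumed reading of "conditional support"); confidence is the share of
    those records that also match the consequent.
    """
    n = len(records)
    antecedent_counts = Counter()
    joint_counts = Counter()
    for rec in records:
        consequent = rec[target]
        for key, level in rec.items():
            if key == target:
                continue
            antecedent_counts[(key, level)] += 1
            joint_counts[(key, level, consequent)] += 1

    rules = []
    for (key, level, consequent), joint in joint_counts.items():
        ante = antecedent_counts[(key, level)]
        support = ante / n
        confidence = joint / ante
        # Keep only rules that pass both thresholds (5% and 30% in the text).
        if support >= min_support and confidence >= min_confidence:
            rules.append(((key, level), consequent, support, confidence))
    # Highest confidence first, ties broken by higher support.
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


# Hypothetical, hand-made records for illustration only (not the study's data).
records = [
    {"Tvsto": "high", "Ce": "high"},
    {"Tvsto": "high", "Ce": "high"},
    {"Tvsto": "high", "Ce": "moderate"},
    {"Tvsto": "low", "Ce": "low"},
]
rules = mine_single_antecedent_rules(records, target="Ce")
for antecedent, consequent, support, confidence in rules:
    print(antecedent, "->", consequent, f"support={support:.1%}", f"confidence={confidence:.1%}")
```

Under this reading, a rule can have a large support and a modest confidence at the same time, as several rows of Table 11 do, because support counts only how often the antecedent occurs.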

6. Conclusions

To address the core issues of winter heating operation for VRF systems in Zhengzhou and regions with similar climates, this study has achieved the following clear outcomes through multi-operating-condition experiments and MLR-ARM collaborative analysis.
(1) Aiming at the strong parameter coupling and large operating-condition fluctuations of VRF winter heating, this study combines the quantitative strength of MLR with the dynamic adaptability of ARM. This overcomes the limitations of using either method alone: ARM cannot quantify adjustment benchmarks, and MLR cannot adapt to changes in operating conditions. Specifically, the 8-parameter energy consumption prediction model constructed by MLR achieves a goodness of fit of R² = 0.925 (p < 0.001), and the dynamic operating-condition rules mined by ARM reach a maximum confidence level of 100%. This provides a reusable methodological template for data mining of VRF systems in similar climates.
(2) The MLR model clarifies the quantitative impact of key parameters on energy consumption. Every 1 °C increase in the fixed frequency 1 discharge temperature leads to an average increase of 0.688 kWh in system energy consumption, and every 100-step increase in the EXV step count results in a 0.4 kWh increase. This result quantifies, for the first time, the correlation between operating parameters and energy consumption for VRF winter heating in central China, addressing the lack of quantitative rules for winter heating scenarios.
(3) The rules mined by ARM reveal that the correlation between parameters and energy consumption changes significantly with operating conditions. In terms of temperature gradients, under high load at 30 °C the confidence that a lower EXV step count leads to lower energy consumption is 100%, which is markedly higher than the 58.9% confidence under low load at 18 °C. In terms of load rate, under high load with four terminal units activated, the impact intensity of Tff1 on energy consumption is 60% higher than under low load with only one terminal unit activated. This rule provides direct evidence for scenario-adapted, differentiated adjustment.
The results of this study provide data to support the optimization of the energy efficiency of VRF systems. For example, lowering the EXV step count and the set temperature can reduce energy consumption while maintaining thermal comfort. Further research could fuse real-time weather data with user behavior patterns to improve the generalization ability of the model, and could explore the application of data mining techniques to fault warning. The results obtained in this paper are based on data from central China and are most directly applicable to that region. Although the numerical results can serve only as a reference elsewhere, the research methods proposed in this study can provide guidance for other regions.

Author Contributions

Conceptualization, J.Z.; Methodology, J.Z.; Formal analysis, R.L.; Investigation, Z.X., X.Z. and Y.G.; Resources, Z.X., X.Z. and R.L.; Data curation, J.Z.; Writing—original draft, X.L.; Writing—review & editing, X.L., C.D. and Y.G.; Supervision, J.Z., C.D. and Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

The study was supported by Henan Provincial Key Science and Technology Research Projects (Grant No. 252102321079) and Henan Provincial Natural Science Foundation (Grant No. 252300421524).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

| Symbol | Description |
|---|---|
| Ce | Energy consumption (kWh) |
| Tff1 | Fixed frequency 1 discharge temperature (°C) |
| Tff2 | Fixed frequency 2 discharge temperature (°C) |
| Tvsto | Variable frequency shell top oil temperature (°C) |
| Tsto1 | Fixed frequency 1 shell top oil temperature (°C) |
| Tsto2 | Fixed frequency 2 shell top oil temperature (°C) |
| Pha | Average high pressure (MPa) |
| Pla | Average low pressure (MPa) |
| Fc | Compressor frequency (Hz) |
| Cout | Instantaneous output capacity (kW) |
| Irma | Compressor current effective value (A) |
| Urms | Busbar voltage effective value (V) |
| Tbox | Electrical box temperature (°C) |
| Iu | U-phase current value (A) |
| Iv | V-phase current value (A) |
| Uf1 | Fan 1 bus voltage (V) |
| Uf2 | Fan 2 bus voltage (V) |
| Tin | Inlet pipe temperature (°C) |
| Tout | Outlet pipe temperature (°C) |
| Te | Ambient temperature (°C) |
| Kexv | EXV step count |
| VRF | Variable refrigerant flow |
| HVAC | Heating, ventilation and air conditioning |
| CO2 | Carbon dioxide |
| ARM | Association rule mining |
| MLR | Multiple linear regression |
| VIF | Variance inflation factor |

References

  1. Li, M.; Zhang, Y.; Yu, G.; Sun, J.; Liu, J.; Wang, Y.; Yu, Y. Evaluation of Low-Carbon Development in the Construction Industry and Forecast of Trends: A Case Study of the Yangtze River Delta Region. Sustainability 2025, 17, 5435. [Google Scholar] [CrossRef]
  2. Alsanie, G. Investigating the Impact of Digitalization on Resource Use, Energy Use, and Waste Reduction Towards Sustainability: Considering Environmental Awareness as a Moderator. Sustainability 2025, 17, 4073. [Google Scholar] [CrossRef]
  3. Seraj, M.; Seraj, F.T. The Impact of Sustainable Financial Development and Green Energy Transition on Climate Change in the World’s Highest Carbon-Emitting Countries. Sustainability 2025, 17, 3781. [Google Scholar] [CrossRef]
  4. Yang, L.; Yan, H.; Lam, J.C. Thermal comfort and building energy consumption implications—A review. Appl. Energy 2014, 115, 164–173. [Google Scholar] [CrossRef]
  5. Yu, X.; Yan, D.; Sun, K.; Hong, T.; Zhu, D. Comparative study of the cooling energy performance of variable refrigerant flow systems and variable air volume systems in office buildings. Appl. Energy 2016, 183, 725–736. [Google Scholar] [CrossRef]
  6. Yan, Y.; Cai, J.; Tang, Y.; Yu, Y. A Decentralized Boltzmann-machine-based fault diagnosis method for sensors of Air Handling Units in HVACs. J. Build. Eng. 2022, 50, 104130. [Google Scholar] [CrossRef]
  7. Zhang, G.; Xiao, H.; Zhang, P.; Wang, B.; Li, X.; Shi, W.; Cao, Y. Review on recent developments of variable refrigerant flow systems since 2015. Energy Build. 2019, 198, 444–466. [Google Scholar] [CrossRef]
  8. Aynur, T.N.; Hwang, Y.; Radermacher, R. Simulation comparison of VAV and VRF air conditioning systems in an existing building for the cooling season. Energy Build. 2009, 41, 1143–1150. [Google Scholar] [CrossRef]
  9. Kim, D.; Cox, S.J.; Cho, H.; Im, P. Evaluation of energy savings potential of variable refrigerant flow (VRF) from variable air volume (VAV) in the U.S. climate locations. Energy Rep. 2017, 3, 85–93. [Google Scholar] [CrossRef]
  10. Özahi, E.; Abuşoğlu, A.; Kutlar, A.İ.; Dağcı, O. A comparative thermodynamic and economic analysis and assessment of a conventional HVAC and a VRF system in a social and cultural center building. Energy Build. 2017, 140, 196–209. [Google Scholar] [CrossRef]
  11. Keleher, M.; Narayanan, R. Performance analysis of alternative HVAC systems incorporating renewable energies in sub-tropical climates. Energy Procedia 2019, 160, 147–154. [Google Scholar] [CrossRef]
  12. Liu, H.; Zhang, Z.; Li, H.; Wang, S.; Hu, B.; Wang, R. Research and development of a permanent-magnet synchronous frequency-convertible centrifugal compressor. Int. J. Refrig. 2020, 117, 33–43. [Google Scholar] [CrossRef]
  13. Darwiche, M.; Faraj, J.; Ali, S.; Murr, R.; Taher, R.; El Hage, H.; Khaled, M. Using geothermal energy in enhancing all-air HVAC system performance—Case study, thermal analysis and economic insights. Unconv. Resour. 2025, 5, 100125. [Google Scholar] [CrossRef]
  14. Ra, N.; Ghosh, A.; Bhattacharjee, A. IoT-based smart energy management for solar vanadium redox flow battery powered switchable building glazing satisfying the HVAC system of EV charging stations. Energy Convers. Manag. 2023, 281, 116851. [Google Scholar] [CrossRef]
  15. Wang, B.; Chen, Z.; You, G.; Ding, J.; Cheng, G.; Bui, T. Improving indoor air quality and cooling efficiency: Indirect dew-point evaporative cooling in South China summers. Energy Build. 2024, 324, 114908. [Google Scholar] [CrossRef]
  16. Shyu, H.-Y.; Bair, R.A.; Castro, C.J.; Xaba, L.P.; Ncube, T.T.; Cottingham, R.; Mutsakatira, E.; Yeh, D.H. Advanced Non-Sewered sanitation system for onsite water recycling in a South African informal settlement. Water Res. X 2025, 29, 100342. [Google Scholar] [CrossRef]
  17. Tian, Z.; Lin, X.; Lu, Y.; Song, W.; Niu, J. Imbalanced data-oriented model learning method for ultra-short-term air conditioning load prediction. Energy Build. 2023, 286, 112931. [Google Scholar] [CrossRef]
  18. Zhao, Y.; Zhang, C.; Zhang, Y.; Wang, Z.; Li, J. A review of data mining technologies in building energy systems: Load prediction, pattern identification, fault detection and diagnosis. Energy Built Environ. 2020, 1, 149–164. [Google Scholar] [CrossRef]
  19. Fan, C.; Xiao, F.; Li, Z.; Wang, J. Unsupervised data analytics in mining big building operational data for energy efficiency enhancement: A review. Energy Build. 2018, 159, 296–308. [Google Scholar] [CrossRef]
  20. Tian, Z.; Lu, Z.; Lu, Y.; Zhang, Q.; Lin, X.; Niu, J. An unsupervised data mining-based framework for evaluation and optimization of operation strategy of HVAC system. Energy 2023, 291, 130043. [Google Scholar] [CrossRef]
  21. Wang, M.; Wang, Z.; Geng, Y.; Lin, B. Interpreting the neural network model for HVAC system energy data mining. Build. Environ. 2022, 209, 108449. [Google Scholar] [CrossRef]
  22. Xu, Y.; Yan, C.; Shi, J.; Lu, Z.; Niu, X.; Jiang, Y.; Zhu, F. An anomaly detection and dynamic energy performance evaluation method for HVAC systems based on data mining. Sustain. Energy Technol. Assess. 2021, 44, 101092. [Google Scholar] [CrossRef]
  23. Qian, M.; Hu, S.; Wu, Y.; Liu, H.; Yan, D. Quantitative index for temporal and spatial patterns of occupant behavior based on VRF big data. Energy Build. 2024, 322, 114683. [Google Scholar] [CrossRef]
  24. Haghighat, F.; Fung, B.C.M.; Zhou, L. A novel methodology for knowledge discovery through mining associations between building operational data. Energy Build. 2012, 47, 430–440. [Google Scholar] [CrossRef]
  25. Xiao, F.; Fan, C. Data mining in building automation system for improving building operational performance. Energy Build. 2014, 75, 109–118. [Google Scholar] [CrossRef]
  26. Li, G.; Hu, Y.; Chen, H.; Li, H.; Hu, M.; Guo, Y.; Liu, J.; Sun, S.; Sun, M. Data partitioning and association mining for identifying VRF energy consumption patterns under various part loads and refrigerant charge conditions. Appl. Energy 2017, 185, 846–861. [Google Scholar] [CrossRef]
  27. Wu, Y.; Hu, S.; Qian, M.; Xiong, J.; Yan, D. Advanced analysis of operating parameters utilizing big data to improve building cooling equipment energy efficiency standards. Sustain. Cities Soc. 2024, 109, 105539. [Google Scholar] [CrossRef]
  28. Zhou, X.; Wang, N.; Zou, J.; Liu, G.; Zhuang, X.; Liu, G. Analysis and prediction of energy consumption in office buildings with variable refrigerant flow systems: A case study. J. Build. Eng. 2024, 97, 110936. [Google Scholar] [CrossRef]
  29. Movahed, P.; Taheri, S.; Razban, A. A bi-level data-driven framework for fault-detection and diagnosis of HVAC systems. Appl. Energy 2023, 339, 120948. [Google Scholar] [CrossRef]
  30. Sha, X.; Ma, Z.; Sethuvenkatraman, S.; Li, W. A novel rule mining method for knowledge discovery of relationships among indoor air quality, HVAC operation and occupants’ activities. Build. Environ. 2024, 260, 111670. [Google Scholar] [CrossRef]
  31. Zhang, C.; Xue, X.; Zhao, Y.; Zhang, X.; Li, T. An improved association rule mining-based method for revealing operational problems of building heating, ventilation and air conditioning (HVAC) systems. Appl. Energy 2019, 253, 113492. [Google Scholar] [CrossRef]
  32. Zeng, Y.; Zhang, Z.; Kusiak, A. Predictive modeling and optimization of a multi-zone HVAC system with data mining and firefly algorithms. Energy 2015, 86, 393–402. [Google Scholar] [CrossRef]
  33. Du, Z.; Liang, X.; Chen, S.; Li, P.; Zhu, X.; Chen, K.; Jin, X. Domain adaptation deep learning and its T-S diagnosis networks for the cross-control and cross-condition scenarios in data center HVAC systems. Energy 2023, 280, 128084. [Google Scholar] [CrossRef]
  34. Hardy, S.; Van Hertem, D.; Ergun, H. Application of Association Rule Mining in offshore HVAC transmission topology optimization. Electr. Power Syst. Res. 2022, 211, 108358. [Google Scholar] [CrossRef]
  35. Yang, C.; Chen, H. Analysis of power consumption influencing factors in different modes of semiconductor factory refrigeration station based on association rule mining. Int. J. Refrig. 2025, 175, 187–195. [Google Scholar] [CrossRef]
  36. Guo, Y.; Liu, Y.; Wang, Y.; Wang, Z.; Zhang, Z.; Xue, P. Advance and prospect of machine learning based fault detection and diagnosis in air conditioning systems. Renew. Sustain. Energy Rev. 2024, 205, 114853. [Google Scholar] [CrossRef]
  37. Bairami-Khankandi, S.; Bolbot, V.; BahooToroody, A.; Goerlandt, F. A systems-theoretic approach using association rule mining and predictive Bayesian trend analysis to identify patterns in maritime accident causes. Reliab. Eng. Syst. Saf. 2025, 258, 110911. [Google Scholar] [CrossRef]
  38. Sun, C.; Yuan, L.; Cao, S.; Xia, G.; Liu, Y.; Wu, X. Identifying supply-demand mismatches in district heating system based on association rule mining. Energy 2023, 280, 128124. [Google Scholar] [CrossRef]
  39. Guo, Y.; Li, G.; Chen, H.; Wang, J.; Guo, M.; Sun, S.; Hu, W. Optimized neural network-based fault diagnosis strategy for VRF system in heating mode using data mining. Appl. Therm. Eng. 2017, 125, 1402–1413. [Google Scholar] [CrossRef]
  40. Piscitelli, M.S.; Mazzarelli, D.M.; Capozzoli, A. Enhancing operational performance of AHUs through an advanced fault detection and diagnosis process based on temporal association and decision rules. Energy Build. 2020, 226, 110369. [Google Scholar] [CrossRef]
  41. Wang, M.; Hu, E.; Chen, L. CFD analysis and optimization of thermal stratification in a Thermal Diode Tank (TDT). J. Energy Storage 2024, 76, 109837. [Google Scholar] [CrossRef]
  42. Wang, M.; Hu, E.; Chen, L. Simulation of a radiation-enhanced thermal diode tank (RTDT) assisted refrigeration and air-conditioning (RAC) system using TRNSYS. J. Build. Eng. 2024, 82, 108168. [Google Scholar] [CrossRef]
  43. Han, J.; Kamber, M.; Pei, J. Data Mining: Concepts and Techniques, 4th ed.; Morgan Kaufmann (Elsevier): Burlington, MA, USA, 2022. [Google Scholar]
Figure 1. Site diagram of VRF system.
Figure 2. Flow chart of VRF system.
Figure 3. Research framework diagram.
Figure 4. Correlation analysis results of 24 characteristic variables.
Figure 5. Collinearity statistics for 11 characteristic variables.
Figure 6. Collinearity diagnosis for compressor frequency and instantaneous output capacity.
Figure 7. Collinearity analysis of shell top oil temperature and discharge temperature.
Figure 8. Collinearity analysis of inlet and outlet pipe temperatures.
Figure 9. Collinearity statistics of optimized characteristic variables.
Figure 10. Standardized residual plot of the regression model.
Figure 11. Fitting plot of the regression equation.
Table 1. Experimental condition settings.

| Experiment Group No. | Indoor Set Temperature (°C) | Terminal Activation Status |
|---|---|---|
| 1 | 18 | Terminal 1 |
| 2 | 18 | Terminal 1, 2 |
| 3 | 18 | Terminal 1, 2, 3 |
| 4 | 18 | Terminal 2, 3 |
| 5 | 18 | Terminal 2, 3, 4 |
| 6 | 18 | Terminal 1, 3 |
| 7 | 18 | Terminal 1, 3, 4 |
| 8 | 18 | Terminal 3, 4 |
| 9 | 24 | Terminal 1 |
| 10 | 24 | Terminal 1, 2 |
| 11 | 24 | Terminal 1, 2, 3 |
| 12 | 30 | Terminal 1 |
| 13 | 30 | Terminal 1, 2 |
| 14 | 30 | Terminal 1, 2, 3 |
Table 2. Airspeed experiment settings.

| Experiment Group No. | Indoor Set Temperature (°C) | Terminal Activation Status | Indoor Airspeed Setting |
|---|---|---|---|
| 1 | 24 | Terminal 1 | High airspeed |
| 2 | 24 | Terminal 1 | Medium airspeed |
| 3 | 24 | Terminal 1 | Low airspeed |
| 4 | 24 | Terminals 1, 3 | High airspeed |
| 5 | 24 | Terminals 1, 3 | Medium airspeed |
| 6 | 24 | Terminals 1, 3 | Low airspeed |
Table 3. Regression model summary.

| Model | B (Unstandardized) | Standard Deviation (Unstandardized) | Beta (Standardized) | t | Sig. | Tolerance | VIF |
|---|---|---|---|---|---|---|---|
| Constant | 527.420 | 1.081 | - | 487.974 | 0.000 | - | - |
| Fixed frequency 1 discharge temperature | 0.696 | 0.011 | 0.790 | 63.144 | 0.000 | 0.140 | 7.134 |
| Fixed frequency 1 shell top oil temperature | −0.590 | 0.007 | −1.724 | −85.847 | 0.000 | 0.054 | 18.384 |
| Fixed frequency 2 shell top oil temperature | 0.699 | 0.067 | 0.149 | 10.501 | 0.000 | 0.108 | 9.229 |
| Average low pressure | −0.333 | 0.019 | −0.189 | −17.891 | 0.000 | 0.197 | 5.072 |
| Compressor frequency | 0.011 | 0.051 | 0.023 | 0.208 | 0.835 | 0.002 | 543.857 |
| Instantaneous output capacity | 0.013 | 0.029 | 0.049 | 0.446 | 0.656 | 0.002 | 560.090 |
| Electrical box temperature | −0.075 | 0.040 | −0.010 | −1.862 | 0.063 | 0.752 | 1.329 |
| Inlet pipe temperature | −0.424 | 0.028 | −0.257 | −15.168 | 0.000 | 0.076 | 13.106 |
| Outlet pipe temperature | 0.144 | 0.015 | 0.177 | 9.887 | 0.000 | 0.069 | 14.574 |
| Ambient temperature | 0.303 | 0.038 | 0.062 | 7.992 | 0.000 | 0.362 | 2.762 |
| EXV step count | 0.006 | 0.001 | 0.040 | 7.768 | 0.000 | 0.812 | 1.231 |

R = 0.968, R² = 0.938, Adjusted R² = 0.937, F = 3886.796
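The Tolerance and VIF columns of Table 3 are reciprocals of each other (for example, 1/0.140 ≈ 7.14 against the reported 7.134, the gap being rounding). As a minimal sketch of how such values are commonly obtained, the NumPy code below regresses each predictor on the remaining ones and takes VIF_j = 1 / (1 − R_j²). The function name and the synthetic data are assumptions for illustration and are not the study's data or software.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept plus remaining predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        tolerance = (resid @ resid) / ((y - y.mean()) ** 2).sum()  # equals 1 - R_j^2
        vifs[j] = 1.0 / tolerance
    return vifs


# Synthetic data (hypothetical): capacity is built to be nearly collinear with frequency.
rng = np.random.default_rng(0)
freq = rng.normal(60.0, 10.0, 500)
capacity = 0.5 * freq + rng.normal(0.0, 0.1, 500)
ambient = rng.normal(5.0, 3.0, 500)  # generated independently of the other two
vifs = variance_inflation_factors(np.column_stack([freq, capacity, ambient]))
print(np.round(vifs, 1))
```

In this toy setup the two collinear columns receive very large VIF values while the independent column stays close to 1, which mirrors the pattern for compressor frequency and instantaneous output capacity in Table 3.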
Table 4. Collinearity diagnosis for compressor frequency and instantaneous output capacity.

| Dimension | Eigenvalue | Conditional Indicators | Variance Proportion: Constant | Variance Proportion: Compressor Frequency | Variance Proportion: Instantaneous Output Capacity |
|---|---|---|---|---|---|
| 1 | 2.985 | 1.000 | 0.00 | 0.00 | 0.00 |
| 2 | 0.015 | 14.159 | 0.97 | 0.00 | 0.00 |
| 3 | 2.885 × 10⁻⁵ | 321.674 | 0.03 | 1.00 | 1.00 |
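The "Conditional Indicators" column appears to follow the usual condition index definition, the square root of the largest eigenvalue divided by each eigenvalue. The short sketch below checks that reading against the rounded eigenvalues printed in Table 4. The function name is hypothetical, and small differences from the tabulated values are expected because the printed eigenvalues are rounded.

```python
import math


def condition_indices(eigenvalues):
    """Condition index per dimension: sqrt(largest eigenvalue / eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]


# Rounded eigenvalues as printed in Table 4.
table4_eigenvalues = [2.985, 0.015, 2.885e-5]
indices = condition_indices(table4_eigenvalues)
for dimension, ci in enumerate(indices, start=1):
    print(dimension, round(ci, 3))
```

The computed indices land close to the tabulated 1.000, 14.159, and 321.674, which supports the assumed definition.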
Table 5. Collinearity analysis of shell top oil temperature and discharge temperature.

| Dimension | Eigenvalue | Conditional Indicators | Variance Proportion: Constant | Variance Proportion: Fixed Frequency 1 Shell Top Oil Temperature | Variance Proportion: Fixed Frequency 2 Shell Top Oil Temperature | Variance Proportion: Fixed Frequency 1 Discharge Temperature |
|---|---|---|---|---|---|---|
| 1 | 3.745 | 1.000 | 0.00 | 0.00 | 0.00 | 0.01 |
| 2 | 0.198 | 4.352 | 0.01 | 0.04 | 0.00 | 0.13 |
| 3 | 0.055 | 8.280 | 0.00 | 0.33 | 0.00 | 0.84 |
| 4 | 0.002 | 40.467 | 0.98 | 0.93 | 1.00 | 0.03 |
Table 6. Collinearity analysis of inlet and outlet pipe temperatures.

| Dimension | Eigenvalue | Conditional Indicators | Variance Proportion: Constant | Variance Proportion: Inlet Pipe Temperature | Variance Proportion: Outlet Pipe Temperature |
|---|---|---|---|---|---|
| 1 | 2.997 | 1.000 | 0.00 | 0.00 | 0.00 |
| 2 | 0.003 | 32.658 | 0.23 | 0.00 | 0.07 |
| 3 | 0.000 | 150.760 | 0.77 | 1.00 | 0.93 |
Table 7. Optimized model summary.

| Model | B (Unstandardized) | Standard Deviation (Unstandardized) | Beta (Standardized) | t | Sig. | Tolerance | VIF |
|---|---|---|---|---|---|---|---|
| Constant | 527.121 | 1.050 | - | 502.127 | 0.000 | - | - |
| Fixed frequency 1 discharge temperature | 0.688 | 0.012 | 0.788 | 59.496 | 0.000 | 0.149 | 6.709 |
| Fixed frequency 1 shell top oil temperature | −0.538 | 0.005 | −1.574 | −100.912 | 0.000 | 0.108 | 9.294 |
| Average low pressure | −0.274 | 0.020 | −0.156 | −13.751 | 0.000 | 0.204 | 4.903 |
| Compressor frequency | 0.019 | 0.005 | 0.041 | 3.870 | 0.000 | 0.228 | 4.386 |
| Electrical box temperature | −0.174 | 0.044 | −0.023 | −3.974 | 0.000 | 0.759 | 1.318 |
| Inlet and outlet pipe temperature difference | −0.053 | 0.011 | −0.036 | −4.895 | 0.000 | 0.497 | 2.011 |
| Ambient temperature | 0.074 | 0.040 | 0.015 | 1.844 | 0.045 | 0.388 | 2.576 |
| EXV step count | 0.004 | 0.001 | 0.026 | 4.621 | 0.000 | 0.832 | 1.201 |

R = 0.962, R² = 0.925, Adjusted R² = 0.925, F = 4421.723
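The marginal effects quoted in the Conclusions follow directly from the unstandardized coefficients in Table 7. The sketch below treats the model as a plain linear equation and evaluates the change in predicted energy consumption for a change in one input. The key `dT_pipe` for the inlet and outlet pipe temperature difference, the function names, and the baseline operating point are hypothetical, and the units of each input are assumed to match those used when the model was fitted.

```python
# Unstandardized coefficients B copied from Table 7.
TABLE7_COEFFICIENTS = {
    "constant": 527.121,
    "Tff1": 0.688,      # fixed frequency 1 discharge temperature
    "Tsto1": -0.538,    # fixed frequency 1 shell top oil temperature
    "Pla": -0.274,      # average low pressure
    "Fc": 0.019,        # compressor frequency
    "Tbox": -0.174,     # electrical box temperature
    "dT_pipe": -0.053,  # inlet and outlet pipe temperature difference (hypothetical key)
    "Te": 0.074,        # ambient temperature
    "Kexv": 0.004,      # EXV step count
}


def predict_energy(features, coefficients=TABLE7_COEFFICIENTS):
    """Evaluate the linear model: constant + sum of coefficient * feature value."""
    total = coefficients["constant"]
    for name, value in features.items():
        total += coefficients[name] * value
    return total


def marginal_effect(name, delta, coefficients=TABLE7_COEFFICIENTS):
    """Change in predicted energy when one feature changes by delta, others held fixed."""
    return coefficients[name] * delta


# Hypothetical operating point, used only to show that differences do not depend on it.
baseline = {"Tff1": 80.0, "Tsto1": 60.0, "Pla": 0.8, "Fc": 60.0,
            "Tbox": 30.0, "dT_pipe": 10.0, "Te": 5.0, "Kexv": 200.0}
warmer = dict(baseline, Tff1=baseline["Tff1"] + 1.0)

print(round(predict_energy(warmer) - predict_energy(baseline), 3))  # effect of +1 °C in Tff1
print(round(marginal_effect("Kexv", 100), 3))                       # effect of +100 EXV steps
```

Because the model is linear, the effect of a 1 °C rise in Tff1 equals its coefficient regardless of the baseline chosen, and a 100-step rise in the EXV step count gives 100 × 0.004.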
Table 8. Association rules (single antecedent) at different temperatures with terminals 1 and 2 activated.

| Set Temperature | Antecedent | Consequent | Support (%) | Confidence (%) |
|---|---|---|---|---|
| 18 °C | Low inlet temperature of Terminal 2 | Low energy consumption | 20.968 | 66.667 |
| 18 °C | Low outlet temperature of Terminal 2 | Low energy consumption | 24.194 | 100 |
| 18 °C | Low oil temperature of the fixed frequency shell | Low energy consumption | 30.645 | 98.246 |
| 18 °C | Low variable frequency discharge temperature | Low energy consumption | 42.473 | 60.759 |
| 18 °C | Low EXV step count of Terminal 1 | Low energy consumption | 51.075 | 58.947 |
| 18 °C | Moderate variable frequency discharge temperature | Moderate energy consumption | 23.118 | 51.163 |
| 18 °C | High outlet temperature of Terminal 1 | Moderate energy consumption | 34.409 | 82.812 |
| 18 °C | Low inlet temperature of Terminal 1 | Moderate energy consumption | 39.785 | 54.054 |
| 18 °C | High inlet temperature of Terminal 2 | High energy consumption | 45.161 | 80.952 |
| 18 °C | High outlet temperature of Terminal 2 | High energy consumption | 51.075 | 81.053 |
| 18 °C | High oil temperature at the top of the fixed frequency shell | High energy consumption | 57.527 | 67.29 |
| 18 °C | High fixed frequency discharge temperature | High energy consumption | 62.366 | 62.069 |
| 18 °C | High discharge temperature of variable frequency drive | High energy consumption | 34.409 | 100 |
| 18 °C | High variable frequency shell top oil temperature | High energy consumption | 62.366 | 66.379 |
| 24 °C | Low outlet temperature of Terminal 2 | Low energy consumption | 22.059 | 80 |
| 24 °C | High outlet temperature of Terminal 1 | Low energy consumption | 26.471 | 52.778 |
| 24 °C | Relatively low variable frequency discharge temperature | Low energy consumption | 24.265 | 90.909 |
| 24 °C | Low variable frequency discharge temperature | Low energy consumption | 28.676 | 66.667 |
| 24 °C | Moderate inlet temperature of Terminal 1 | Low energy consumption | 29.412 | 80 |
| 24 °C | High outlet temperature of Terminal 2 | Low energy consumption | 34.559 | 63.830 |
| 24 °C | Low inlet temperature of Terminal 2 | Low energy consumption | 42.647 | 75.862 |
| 24 °C | High variable frequency shell top oil temperature | Low energy consumption | 45.588 | 90.323 |
| 24 °C | Low EXV step count of Terminal 1 | Low energy consumption | 48.529 | 84.848 |
| 24 °C | High fixed frequency discharge temperature | Low energy consumption | 76.471 | 52.885 |
| 24 °C | Low outlet temperature of Terminal 1 | Moderate energy consumption | 29.412 | 75 |
| 24 °C | Low inlet temperature of Terminal 1 | Moderate energy consumption | 42.647 | 58.621 |
| 24 °C | High outlet temperature of Terminal 2 | High energy consumption | 23.529 | 71.875 |
| 24 °C | High inlet temperature of Terminal 1 | High energy consumption | 27.941 | 60.526 |
| 24 °C | High inlet temperature of Terminal 2 | High energy consumption | 28.676 | 58.974 |
| 30 °C | Moderate variable frequency discharge temperature | Low energy consumption | 20.168 | 58.333 |
| 30 °C | Low EXV step count of Terminal 1 | Low energy consumption | 25.210 | 100 |
| 30 °C | High inlet temperature of Terminal 2 | Low energy consumption | 33.613 | 100 |
| 30 °C | High outlet temperature of Terminal 2 | Low energy consumption | 36.134 | 100 |
| 30 °C | High instantaneous output capacity | Moderate energy consumption | 38.655 | 89.130 |
| 30 °C | Low variable frequency shell top oil temperature | Moderate energy consumption | 46.218 | 74.545 |
| 30 °C | Low variable frequency discharge temperature | Moderate energy consumption | 26.050 | 100 |
| 30 °C | High total capacity demand | Moderate energy consumption | 57.143 | 60.294 |
| 30 °C | High outlet temperature of Terminal 1 | Moderate energy consumption | 63.866 | 59.211 |
| 30 °C | Moderate outlet temperature of Terminal 2 | Moderate energy consumption | 34.454 | 100 |
| 30 °C | Low inlet temperature of Terminal 1 | Moderate energy consumption | 64.706 | 58.442 |
| 30 °C | Low outlet temperature of Terminal 2 | High energy consumption | 29.412 | 88.571 |
| 30 °C | High variable frequency shell top oil temperature | High energy consumption | 51.261 | 50.820 |
| 30 °C | Low inlet temperature of Terminal 2 | High energy consumption | 51.261 | 50.820 |
Table 9. Terminal activation sequence at 18 °C.

| Experiment No. | Terminal Activation Status | Terminal Change |
|---|---|---|
| 1 | Terminals 1, 3 | Initial state |
| 2 | Terminals 1, 3, 4 | Activate Terminal 4 |
| 3 | Terminals 3, 4 | Deactivate Terminal 1 |
Table 10. Association rules (single antecedent) with terminals activated.

| Terminal Activation Status | Antecedent | Consequent | Support (%) | Confidence (%) |
|---|---|---|---|---|
| Terminals 1, 3 | Low EXV step count of Terminal 3 | Low energy consumption | 10.076 | 87.5 |
| Terminals 1, 3 | Low Terminal 3 inlet–outlet temperature difference | Low energy consumption | 11.335 | 91.111 |
| Terminals 1, 3 | High fixed frequency discharge temperature | Low energy consumption | 40.554 | 52.795 |
| Terminals 1, 3 | High fixed frequency shell top oil temperature | Low energy consumption | 41.310 | 51.829 |
| Terminals 1, 3 | Low instantaneous output capacity | Moderate energy consumption | 16.121 | 50.0 |
| Terminals 1, 3 | High inlet–outlet temperature difference in Terminal 1 | Moderate energy consumption | 16.877 | 94.030 |
| Terminals 1, 3 | Low fixed frequency discharge temperature | Moderate energy consumption | 22.670 | 86.667 |
| Terminals 1, 3 | High EXV step count of Terminal 3 | Moderate energy consumption | 23.174 | 82.609 |
| Terminals 1, 3 | High variable frequency discharge temperature | High energy consumption | 10.327 | 92.683 |
| Terminals 1, 3 | Low EXV step count of Terminal 1 | High energy consumption | 12.846 | 100 |
| Terminals 1, 3 | Low variable frequency discharge temperature | High energy consumption | 13.602 | 100 |
| Terminals 1, 3 | High instantaneous output capacity | High energy consumption | 17.380 | 60.870 |
| Terminals 1, 3 | High fixed frequency discharge temperature | High energy consumption | 19.144 | 76.316 |
| Terminals 1, 3 | High inlet–outlet temperature difference in Terminal 3 | High energy consumption | 30.479 | 80.165 |
| Terminals 1, 3 | High EXV step count of Terminal 3 | High energy consumption | 30.479 | 100 |
| Terminals 1, 3 | High inlet–outlet temperature difference in Terminal 1 | High energy consumption | 33.501 | 100 |
| Terminals 1, 3, and 4 | Moderate fixed frequency shell top oil temperature | Low energy consumption | 18.994 | 64.706 |
| Terminals 1, 3, and 4 | Low variable frequency shell top oil temperature | Low energy consumption | 15.642 | 60.714 |
| Terminals 1, 3, and 4 | Low instantaneous output capacity | Low energy consumption | 21.788 | 79.487 |
| Terminals 1, 3, and 4 | Low fixed frequency shell top oil temperature | Moderate energy consumption | 11.732 | 57.143 |
| Terminals 1, 3, and 4 | High EXV step count of Terminal 4 | Moderate energy consumption | 14.525 | 69.231 |
| Terminals 1, 3, and 4 | Low inlet–outlet temperature difference in Terminal 1 | High energy consumption | 13.408 | 83.333 |
| Terminals 1, 3, and 4 | High fixed frequency discharge temperature | High energy consumption | 12.291 | 100 |
| Terminals 1, 3, and 4 | High EXV step count of Terminal 1 | High energy consumption | 13.966 | 100 |
| Terminals 1, 3, and 4 | High fixed frequency shell top oil temperature | High energy consumption | 18.994 | 100 |
| Terminals 1, 3, and 4 | High EXV step count of Terminal 4 | High energy consumption | 20.112 | 100 |
| Terminals 1, 3, and 4 | High variable frequency discharge temperature | High energy consumption | 20.670 | 100 |
| Terminals 1, 3, and 4 | High fixed frequency discharge temperature | High energy consumption | 23.464 | 90.476 |
| Terminals 1, 3, and 4 | Moderate inlet–outlet temperature difference in Terminal 4 | High energy consumption | 32.402 | 63.793 |
| Terminals 3 and 4 | High compressor frequency | Low energy consumption | 11.675 | 91.304 |
| Terminals 3 and 4 | High instantaneous output capacity | Low energy consumption | 11.675 | 91.304 |
| Terminals 3 and 4 | High fixed frequency discharge temperature | Low energy consumption | 13.706 | 77.778 |
| Terminals 3 and 4 | Low EXV step count of Terminal 4 | Low energy consumption | 19.289 | 55.263 |
| Terminals 3 and 4 | Low inlet–outlet temperature difference in Terminal 3 | Moderate energy consumption | 21.320 | 50.0 |
| Terminals 3 and 4 | Low inlet–outlet temperature difference in Terminal 4 | Moderate energy consumption | 32.995 | 52.308 |
| Terminals 3 and 4 | Moderate EXV step count of Terminal 4 | High energy consumption | 27.411 | 68.519 |
| Terminals 3 and 4 | Relatively low inlet–outlet temperature difference in Terminal 3 | High energy consumption | 24.873 | 73.469 |
| Terminals 3 and 4 | High inlet–outlet temperature difference in Terminal 4 | High energy consumption | 14.213 | 100 |
| Terminals 3 and 4 | High fixed frequency shell top oil temperature | High energy consumption | 15.228 | 100 |
| Terminals 3 and 4 | Moderate variable frequency shell top oil temperature | High energy consumption | 17.766 | 100 |
| Terminals 3 and 4 | Moderate variable frequency discharge temperature | High energy consumption | 25.381 | 70 |
| Terminals 3 and 4 | Moderate instantaneous output capacity | High energy consumption | 24.873 | 100 |
| Terminals 3 and 4 | High compressor frequency | High energy consumption | 22.843 | 91.111 |
| Terminals 3 and 4 | High inlet–outlet temperature difference in Terminal 3 | High energy consumption | 17.766 | 100 |
| Terminals 3 and 4 | High EXV step count of Terminal 4 | High energy consumption | 53.299 | 96.190 |
| Terminals 3 and 4 | High EXV step count of Terminal 3 | High energy consumption | 73.604 | 69.655 |
Table 11. Association rules (single antecedent) at different airspeed with terminal 1 activated.

| Wind Speed | Antecedent | Consequent | Support (%) | Confidence (%) |
|---|---|---|---|---|
| Low airspeed | Low fixed frequency shell top oil temperature | High energy consumption | 82.955 | 87.671 |
| Low airspeed | High variable frequency shell top oil temperature | High energy consumption | 78.409 | 65.217 |
| Low airspeed | High fixed frequency discharge temperature | High energy consumption | 97.727 | 74.419 |
| Low airspeed | High variable frequency discharge temperature | High energy consumption | 40.909 | 33.333 |
| Medium airspeed | High variable frequency discharge temperature | High energy consumption | 24.242 | 100 |
| Medium airspeed | High variable frequency shell top oil temperature | High energy consumption | 24.242 | 100 |
| Medium airspeed | Low instantaneous output capacity | High energy consumption | 33.333 | 72.727 |
| Medium airspeed | Low total capacity demand | High energy consumption | 59.091 | 53.846 |
| High airspeed | Low variable frequency discharge temperature | Low energy consumption | 11.224 | 100 |
| High airspeed | Low total capacity demand | Low energy consumption | 12.245 | 100 |
| High airspeed | Moderate variable frequency shell top oil temperature | Low energy consumption | 33.673 | 60.606 |
| High airspeed | High fixed frequency shell top oil temperature | Moderate energy consumption | 14.286 | 100 |
| High airspeed | High fixed frequency discharge temperature | Moderate energy consumption | 28.571 | 92.857 |
| High airspeed | Low variable frequency shell top oil temperature | Moderate energy consumption | 42.857 | 76.190 |
| High airspeed | High variable frequency shell top oil temperature | High energy consumption | 23.469 | 100 |
| High airspeed | High total capacity demand | High energy consumption | 47.959 | 97.872 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhu, J.; Liu, X.; Xu, Z.; Zhang, X.; Du, C.; Guo, Y.; Li, R. Research on Operation Data Mining and Analysis of VRF Air-Conditioning Systems Based on ARM and MLR Methods to Enhance Building Sustainability. Sustainability 2025, 17, 8974. https://doi.org/10.3390/su17208974
