Recognition of Variable-Speed Equipment in an Air-Conditioning System Using Numerical Analysis of Energy-Consumption Data

Abstract: Motor-driven equipment (ME) is one of the key components of an air-conditioning system and accounts for the vast majority of the system's total energy consumption. Distinguishing variable- from constant-speed equipment is important since the energy simulation models of the two types differ. Traditionally, the types of ME are known in advance, and energy consumption data are analyzed accordingly. However, in application scenarios of energy consumption data mining, prior information on the ME type could be missing. Thus, this study applies the process in reverse, using the energy consumption data of ME to recognize variable-speed ME in an air-conditioning system. The energy consumption data of ME in an air-conditioning system implemented in a commercial building were collected and numerically analyzed. A proposed simple parameter, the coefficient of the median, and several other numerical parameters were calculated and used to distinguish variable- from constant-speed ME. Results showed that the energy consumption data distributions of the two types of ME differed. The proposed coefficient of the median could successfully distinguish variable- from constant-speed ME, and it could be applied as an important step in energy consumption data mining of air-conditioning systems.


Introduction
Air-conditioning systems are major sources of energy consumption in buildings [1,2]. Therefore, energy conservation for air-conditioning systems is of primary importance in building energy saving. In an air-conditioning system, various types of equipment, such as electric chillers, fans, and pumps, are driven by motors. Motor-driven equipment (ME) is thus a key component in air-conditioning systems, especially in fluid transport subsystems. These subsystems transport refrigerant, cooled/heated air, chilled water, condensing water, or even a glycol solution to ensure the working cycle of the system. The energy consumed by ME makes up the vast majority of the total energy consumption of air-conditioning systems [3]. Thus, reducing the energy costs of ME may be an important approach to energy conservation for air-conditioning systems. Variable-speed ME has been used in air-conditioning systems for more than 40 years [4]. By adjusting its working frequency in response to flow-rate variations, variable-speed ME generally consumes less energy than constant-speed ME under the same conditions [5][6][7].
ChWP-2, and ChWP-3) to guarantee a chilled-water loop between heat exchanger and air-handling units at the terminal, and three glycol water pumps (GWP-1, GWP-2, and GWP-3) to transport the glycol solution between any two or all three of the chiller, the ice-storage tank, and the heat exchanger. Four of the nine pumps could operate with variable speed, while the other five worked with constant speed. However, before this study, the authors were not aware of which pumps were variable-speed ME. This was later established on the basis of energy consumption data mining.

Collection and Processing of ME Energy Consumption Data
A comprehensive energy consumption monitoring platform was established for the air-conditioning system to monitor the electricity consumption of each component of the system. However, due to logistics and other constraints, the energy consumption of cooling towers was not available in the present case. Therefore, a follow-up study was employed to determine the energy consumption data of the remaining ME, including the aforementioned nine pumps, five AHUs, and two chillers. As it was expected that this study would be a true reflection of the use of the ME, data collection was conducted without any planned interventions or controls. For the involved ME, the electricity consumption of each device was recorded every hour from 19 May to 20 August 2013. In total, 2262 electricity consumption records were obtained for each device.
The raw data of electricity consumption of the ME and the corresponding time downloaded from the platform were then processed before numerical analysis. First, data with zero electricity consumption were eliminated since they demonstrated that the ME was turned off at the time. Second, data with duplicated time records were deleted to avoid repeated counts. Lastly, a processed electricity consumption data list with a time record was obtained for each device.
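The two cleaning steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' processing code: the function name `clean_readings` and the `(timestamp, kWh)` tuple layout are assumptions made for the example, and the choice to keep the first of several records sharing a time stamp is likewise an assumption.

```python
from datetime import datetime


def clean_readings(rows):
    """Filter hourly readings given as (timestamp, kWh) tuples.

    Hypothetical helper: the tuple layout and the keep-first rule for
    duplicated time records are assumptions made for this sketch.
    """
    seen = set()
    cleaned = []
    for timestamp, kwh in rows:
        if kwh == 0:
            # Step 1: zero consumption means the device was switched off.
            continue
        if timestamp in seen:
            # Step 2: a repeated time record would be counted twice.
            continue
        seen.add(timestamp)
        cleaned.append((timestamp, kwh))
    return cleaned


# Made-up readings, used only to show the behaviour of the helper.
raw = [
    (datetime(2013, 5, 19, 0), 0.0),
    (datetime(2013, 5, 19, 1), 12.5),
    (datetime(2013, 5, 19, 1), 12.5),
    (datetime(2013, 5, 19, 2), 11.0),
]
processed = clean_readings(raw)
```

Because records with zero consumption are dropped before the duplicate check, the result is the same as filtering first and de-duplicating second, which is the order given in the text.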

Numerical Analysis of Energy Consumption Data
Data distribution analysis can reveal the characteristics of energy consumption data. Nevertheless, using data distribution fitting can lead to extremely heavy computational loads, particularly with massive amounts of data. Instead, the numerical characteristics of data are important for data statistics, as they can intuitively reveal the data distribution and can easily be obtained. In this study, some commonly used parameters were introduced, including dimensional parameters such as maximal value $E_{max}$, minimal value $E_{min}$, range $E_{range}$, mean value $E_{mean}$, median value $E_{median}$, mode $E_{mode}$, variance $E_{var}$, and standard deviation $E_{sd}$, and dimensionless parameters, including coefficient of variation $C_v$, coefficient of skewness $C_s$, and coefficient of kurtosis $C_k$.
Energies 2020, 13, 4975
Specifically, the coefficient of variation $C_v$ is defined as the ratio of the standard deviation to the mean value of the data [43], calculated using Equation (1):
$$C_v = \frac{E_{sd}}{E_{mean}} \quad (1)$$
The coefficient of skewness $C_s$ indicates the asymmetry of the data distribution [44], calculated using Equation (2). In Equation (2), $n$ represents the data quantity, and $E_i$ represents the $i$-th electricity consumption value.
The coefficient of kurtosis $C_k$ reveals the existence of extreme values in the data list [45], calculated using Equation (3), in which the parameters are the same as those in Equation (2).
Additionally, a simple dimensionless parameter, the coefficient of the median $C_m$, is proposed in the present study, defined as the ratio of the difference between the maximum and the median to the range, calculated using Equation (4):
$$C_m = \frac{E_{max} - E_{median}}{E_{range}} \quad (4)$$
The theoretical and practical flow underlying the proposition of $C_m$ is discussed in Section 4.2.
The parameters mentioned above were calculated or recorded to understand the numerical characteristics of energy consumption for each device. Afterward, the parameters were compared to determine if their disparities could be used to directly recognize the variable-speed ME.
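As a rough illustration of how the four dimensionless parameters could be computed for one device, consider the sketch below. The coefficient of variation and the coefficient of the median follow the verbal definitions given above. For skewness and kurtosis, the sketch assumes the plain moment-based forms (third and fourth central moments divided by the corresponding power of the population standard deviation, kurtosis not reduced by 3); the exact estimators used in the study may differ, for example by a sample-size correction. The function name is hypothetical.

```python
import statistics


def dimensionless_characteristics(e):
    """Return C_v, C_s, C_k, and C_m for a list of consumption values.

    Assumptions of this sketch: population standard deviation and
    moment-based skewness/kurtosis; these forms are not confirmed
    by the source text.
    """
    n = len(e)
    e_max, e_min = max(e), min(e)
    e_range = e_max - e_min
    if e_range == 0:
        raise ValueError("all values are identical; the ratios are undefined")
    e_mean = statistics.fmean(e)
    e_median = statistics.median(e)
    e_sd = statistics.pstdev(e)
    # Third and fourth central moments of the data.
    m3 = sum((x - e_mean) ** 3 for x in e) / n
    m4 = sum((x - e_mean) ** 4 for x in e) / n
    return {
        "C_v": e_sd / e_mean,
        "C_s": m3 / e_sd ** 3,
        "C_k": m4 / e_sd ** 4,
        "C_m": (e_max - e_median) / e_range,
    }


# Made-up values: one symmetric list and one with a long upper tail.
symmetric = dimensionless_characteristics([1.0, 2.0, 3.0, 4.0, 5.0])
right_tailed = dimensionless_characteristics([1.0, 2.0, 3.0, 4.0, 10.0])
```

For the symmetric list, the median sits midway between the extremes, so $C_m$ is 0.5 and the skewness is zero; for the right-tailed list, the median lies far below the maximum and $C_m$ rises to 7/9.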

ME Energy Consumption Distributions
The energy consumption distributions of the 16 ME items are shown in Figure 2. Obvious differences could be observed among the devices. Taking the nine pumps as an example, some of the pumps, such as GWP-3, consumed more than 40 kWh of electricity, while others consumed much less, e.g., ChWP-1 and CWP-2. From the perspective of distribution, for some pumps, such as ChWP-3 and CWP-3, most of the energy consumption data were grouped near the maximal value. The high-energy-consumption data appeared hundreds of times more often than the low-energy-consumption data did. In comparison, for other pumps, such as ChWP-1 and ChWP-2, energy consumption data were distributed more uniformly, while for GWP-1 and GWP-2, several different peaks could be seen in the distribution. The differences in energy consumption distribution among the ME items may provide inspiration for the recognition of variable-speed equipment. More detailed information on numerical analysis can be found in Section 3.2.

Numerical Characteristics of Energy Consumption Data
The numerical parameters of the energy consumption data for each ME item were calculated, and they are listed in Table 1. As seen, most dimensional parameters of the AHUs were generally the lowest among all the ME, while the two chillers presented most of the highest dimensional parameters. Additionally, sorting the devices on the basis of these dimensional parameters resulted in individual differences. When referring to dimensionless parameters, the lowest $C_s$ and $C_v$ and the highest $C_k$ were found for AHU-5, indicating that the energy consumption data were highly concentrated near the maximal value. In comparison, for ChWP-1, ChWP-2, GWP-1, GWP-2, AHU-1, and AHU-3, $C_s$ and $C_v$ were much higher, while $C_k$ was much lower, implying more uniform distributions.
Table 1. Numerical characteristics of energy consumption distribution of 16 ME items in the case study. ChWP, chilled-water pump; CWP, condensing-water pump; GWP, glycol water pump; Ch, chiller; AHU, air-handling unit.

Recognition of Variable-Speed Pumps
As seen in Table 1, the values of $C_v$ and $C_m$ allowed for dividing the pumps into two groups: one with comparatively high $C_v$ and $C_m$, including ChWP-1, ChWP-2, GWP-1, and GWP-2, and another with comparatively low $C_m$ and $C_v$, consisting of ChWP-3, CWP-1, CWP-2, CWP-3, and GWP-3. To highlight the two groups, a scatter matrix based on numerical parameters is shown in Figure 3. Figure 3 clearly shows that the nine pumps were divided into two groups, as shown in row 9 and column 5. The four red points in the top-right corner represent ChWP-1, ChWP-2, GWP-1, and GWP-2. These high values indicated that the energy consumption data of the four pumps were not concentrated at the extrema but were distributed fairly uniformly, which was in accordance with the operation characteristics of variable-speed pumps. Therefore, it was assumed that ChWP-1, ChWP-2, GWP-1, and GWP-2 were variable-speed ME, while the five other pumps were constant-speed ME.
With the information transmitted from the field, the assumption was then validated, indicating the successful recognition of variable-speed pumps by this study.
$C_m$ and $C_s$ could each be used individually to distinguish the two groups of pumps, as seen in row 4 combined with column 5 and row 9, respectively, while the other numerical parameters failed. However, the calculation of $C_m$ was much simpler than that of $C_s$. Therefore, $C_m$ may be a good indicator for the recognition of variable-speed pumps.

Verifiable Recognition of Variable-Speed AHUs and Constant-Speed Chillers
From the information transmitted on site, the five AHUs were operated with variable speed, while the two chillers worked with constant speed. Therefore, the recognition of AHUs and chillers could be further investigated to verify the applicability of the two indicators $C_m$ and $C_s$. The values of $C_m$ and $C_s$ of each of the seven devices are visually compared in Figure 4. Interestingly, $C_m$ could be used to distinguish variable-speed AHUs from constant-speed chillers by using a single threshold (0.09-0.30), whereas $C_s$ failed to facilitate recognition despite its applicability for pumps. The possible reason for this is discussed in Section 4.2.
The results presented thus far support the idea that the proposed parameter $C_m$ could be a good indicator in the recognition of variable-speed equipment.
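The single-threshold recognition described for the AHUs and chillers can be expressed as a one-line rule. The sketch below is illustrative only: the function name is hypothetical, and the default threshold of 0.2 is an assumed value picked from inside the 0.09-0.30 gap reported for this case study, not a value recommended by the authors.

```python
def classify_by_cm(c_m, threshold=0.2):
    """Label a device from its coefficient of the median.

    threshold=0.2 is a hypothetical choice inside the 0.09-0.30 gap
    reported for this case study; it is not given in the source.
    """
    if not 0.0 <= c_m <= 1.0:
        raise ValueError("C_m is expected to lie between 0 and 1")
    return "variable-speed" if c_m > threshold else "constant-speed"


# Made-up C_m values for two hypothetical devices.
example = {"device_A": 0.05, "device_B": 0.48}
labels = {name: classify_by_cm(value) for name, value in example.items()}
```

Any cut-off inside the gap would label the devices of this case study identically, which is why the text reports a range rather than a single number.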

Feature Selection of Numerical Characteristics of Energy Consumption Data
The above-mentioned results demonstrated that $C_m$, a numerical characteristic of the energy consumption data of an air-conditioning system, can be used to recognize whether pumps, chillers, and AHUs operate with variable speed. To verify and assess its effectiveness, feature selection was performed to compare the relevance of different numerical characteristics to determine whether the equipment worked with variable speed.
The feature selection method was originally based on a set of criteria or calculation methods to filter out feature fields for removal; then, the effectiveness of the remaining features was ranked relative to a specific target [46]. In this article, none of the 10 numerical characteristics mentioned above were removed. Instead, feature selection was used to find the most meaningful feature for the recognition and prediction of the target attribute from the 10 numerical characteristics of the energy consumption data. The feature selection considered one numerical parameter at a time to verify the performance of each parameter in recognizing and predicting the target attribute. The effectiveness value of each characteristic was calculated as (1 − p), where p is the p-value of an appropriate statistical test of association between the candidate characteristic and the target attribute. In this investigation, all numerical characteristics were continuous variables, while the target attributes were categorical variables. Therefore, the numerical characteristics were set as observation variables, target attributes were set as control variables, and p-values based on the F-statistic were used. The idea was to perform a one-way analysis-of-variance (ANOVA) F-test for each characteristic to test whether characteristic X (the observation variable) had the same mean across the different categories of Y (the control variable).
The p-value based on the F-statistic was calculated as
$$p\text{-value} = \mathrm{Prob}\{F(J-1, N-J) > F\},$$
where
$$F = \frac{\sum_{j=1}^{J} N_j (\bar{x}_j - \bar{x})^2 / (J-1)}{\sum_{j=1}^{J} (N_j - 1) s_j^2 / (N-J)} \quad (5)$$
and $F(J-1, N-J)$ is a random variable that follows an F-distribution with degrees of freedom $J-1$ and $N-J$; $N$ is the total number of cases; $J$ is the number of target-attribute categories; $N_j$ represents the number of cases with $Y = j$; $\bar{x}_j$ represents the sample mean of characteristic X for target class $Y = j$; $\bar{x}$ represents the grand mean of characteristic X, $\bar{x} = \sum_{j=1}^{J} N_j \bar{x}_j / N$; and $s_j^2$ represents the sample variance of characteristic X for target category $Y = j$. If the denominator in Equation (5) for a characteristic was zero, the p-value was set as 0 for that characteristic.
If the calculation result of (1 − p) was larger than 0.95, denoting that the observation variable had a strong correlation with the control variable, the tested numerical parameter was deemed effective for the recognition and prediction of the target attribute, and vice versa. Table 2 lists the calculated ANOVA F-test p-values for the ten numerical characteristics ranked by their effectiveness relative to the attribute of variable-speed operations. The (1 − p) values of $C_m$, $E_{median}$, $E_{mean}$, $E_{mode}$, $E_{range}$, and $E_{sd}$ reached or exceeded the threshold for "effective" (0.95). However, $E_{median}$, $E_{mean}$, $E_{mode}$, $E_{range}$, and $E_{sd}$ were dimensional characteristics; thus, their values strongly depended on the equipment and/or the system. For example, we could not infer or predict that equipment with different types, models, or specifications had the same target attribute only on the basis of the same values of $E_{median}$, $E_{mean}$, $E_{mode}$, $E_{range}$, or $E_{sd}$ for energy consumption data. Therefore, the remaining dimensionless characteristic, $C_m$, was the only valid indicator that could be used for the recognition of variable- and constant-speed operation for different types of equipment in the present study. With the above results and discussion, the effective applicability of the proposed characteristic $C_m$ was confirmed both practically and mathematically.
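The F-statistic behind this ranking can be sketched with the Python standard library alone. The function below follows the textbook one-way ANOVA definition (between-category mean square divided by within-category mean square); the function name and the example values are hypothetical, and returning infinity for a zero denominator is a choice of this sketch that mirrors the rule of setting the p-value to 0 in that case. The p-value itself is the upper-tail probability of an F-distribution with $J-1$ and $N-J$ degrees of freedom, which could be obtained from a statistics library (for instance `scipy.stats.f.sf`), an extra dependency that is left out here.

```python
import math
from statistics import fmean, variance


def anova_f_statistic(groups):
    """One-way ANOVA F-statistic for one characteristic.

    groups: one non-empty list of characteristic values per category
    of the target attribute (e.g. variable-speed and constant-speed).
    """
    j = len(groups)
    n = sum(len(g) for g in groups)
    if j < 2 or n <= j:
        raise ValueError("need at least two categories and n > j cases")
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-category mean square, with j - 1 degrees of freedom.
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups) / (j - 1)
    # Within-category mean square, with n - j degrees of freedom.
    # A category with a single case contributes no within-category variance.
    within = sum((len(g) - 1) * variance(g) for g in groups if len(g) > 1) / (n - j)
    if within == 0:
        # Zero denominator: the text sets the p-value to 0 in this case.
        return math.inf
    return between / within


# Made-up values of one characteristic for two categories.
f_value = anova_f_statistic([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

Evaluating one characteristic at a time in this way and ranking the characteristics by (1 − p) corresponds to the procedure described in the text.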

Construction Process of $C_m$
Initially, we also attempted to address the issue of recognition through theoretical reasoning based on the intrinsic correlations of operational data. Variable-speed ME can better adjust to state changes than constant-speed ME can, thus making its operation more energy-efficient. However, for actual air-conditioning systems, it is not possible to simply use energy consumption data to determine that the comparatively low-energy-consumption ME is running at variable speed, or that the comparatively high-energy-consumption ME is running at constant speed. Even if two identical devices are used in parallel, constant-speed ME may simply run for a shorter time than other ME through on-off control, resulting in lower energy consumption. Similarly, variable-speed ME may simply run longer, resulting in higher energy consumption. Furthermore, the energy consumption comparison across different brands, types, and models of ME is much more complex. In real air-conditioning systems, ME of the same type may also consume the same amount of energy, which obviously makes it impossible to distinguish between constant- and variable-speed operating states by comparing their energy consumption. In addition, due to different resistance or loads in the system (or in the loop within the system), coupled with the fact that the operation of ME may be influenced by the different thermal, vibration, and electromagnetic environments, as well as by the power quality at the workplace, it is difficult to use energy consumption data to determine constant- or variable-speed ME via theoretical reasoning. Therefore, this study attempted to address this issue by investigating the distribution of energy consumption data.
Theoretically, there are at least three typical scenarios for the distribution of energy consumption data. The first scenario is a symmetrical distribution, such as the standard normal distribution, which rarely occurs in real data. Instead, energy consumption data in real air-conditioning systems are either positively skewed (i.e., with a positive value of $C_s$) or negatively skewed (i.e., with a negative value of $C_s$).
As shown in Figure 2, the energy consumption distributions of constant-speed operating equipment (i.e., ChWP-3, CWP-1, CWP-2, CWP-3, GWP-3, Ch-1, and Ch-2) typically presented a negative skew; thus, their $C_s$ values were negative, as listed in Table 1. However, it can also be seen from Table 1 that some variable-speed operating equipment (i.e., ChWP-1, ChWP-2, GWP-1, AHU-1, AHU-2, and AHU-5) had negative $C_s$ values. Thus, $C_s$ cannot easily and effectively distinguish variable- from constant-speed ME. $C_s$ essentially measures the symmetry of the distribution [47] or, more generally, characterizes the relationship between $E_{mean}$ and $E_{median}$ [48]. A positive $C_s$ indicates that $E_{mean}$ is larger than $E_{median}$, while a negative $C_s$ implies the opposite. Additionally, a close-to-zero $C_s$ reflects the relative closeness of $E_{mean}$ and $E_{median}$. Since all pumps were installed on system-circulation loops, they tended to receive the same or similar linkage-control strategies/commands for operation to meet the system's flow-rate demands. Therefore, during the running period, the relationships between $E_{mean}$ and $E_{median}$ fell into two distinct groups according to variable- and constant-speed operations. This difference could be precisely captured by $C_s$. Nevertheless, other ME, especially terminals such as the AHUs in this study, are mainly regulated on the basis of the cooling demands of specific spaces. Compared with pumps, AHUs are generally not controlled in linkage; thus, the relationships between $E_{mean}$ and $E_{median}$ are intricate and difficult to characterize using $C_s$. This explains the disparity between the capability of $C_s$ in distinguishing variable- from constant-speed operations for pumps and for other ME (AHUs and chillers).
As demonstrated in Figure 2, the energy consumption of equipment working with a constant speed was generally fixed, and the distribution of energy consumption was thus concentrated near maximal value $E_{max}$ over the running period. On the contrary, the distribution of energy consumption for variable-speed equipment was dispersed rather than focused around $E_{max}$. Therefore, the difference between maximal value $E_{max}$ and a measure of central tendency, which is a simpler and more intuitive parameter, could compensate for the above shortcomings of $C_s$. Furthermore, with respect to the selection of the measure of central tendency, among the three options of mode $E_{mode}$, median $E_{median}$, and mean $E_{mean}$, actual energy consumption data may have multiple modes, while the mean is sensitive to extremes (including outliers). Therefore, median $E_{median}$, a better measure for skewed data such as energy consumption data, was chosen to eliminate these adverse effects. Then, considering the different numerical scales of energy consumption for different types, models, or specifications of equipment, the difference $E_{max} - E_{median}$ was divided by the range $E_{range}$ of the energy consumption data to obtain the dimensionless parameter $C_m$. The process described so far is what was investigated at the theoretical and practical level when proposing the coefficient of the median $C_m$ in the present study.
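The reasoning above can be illustrated with two small synthetic data lists. The numbers below are made up for the example and are not taken from the case study: one list mimics a device whose hourly consumption stays close to its maximum apart from a single low reading, and the other mimics a device whose consumption is spread evenly over its range. The helper `mean_based_ratio` is a hypothetical counterpart of $C_m$ that uses the mean instead of the median, included only to show why the median was preferred.

```python
from statistics import fmean, median


def coefficient_of_median(e):
    """C_m = (E_max - E_median) / E_range, as defined in the text."""
    e_max, e_min = max(e), min(e)
    if e_max == e_min:
        raise ValueError("the range is zero; C_m is undefined")
    return (e_max - median(e)) / (e_max - e_min)


def mean_based_ratio(e):
    """Hypothetical variant that uses the mean, for comparison only."""
    e_max, e_min = max(e), min(e)
    if e_max == e_min:
        raise ValueError("the range is zero; the ratio is undefined")
    return (e_max - fmean(e)) / (e_max - e_min)


# Synthetic hourly consumption in kWh (illustrative values only).
constant_like = [5.0, 28.0, 29.0, 29.0, 29.5, 29.5, 30.0, 30.0, 30.0]
variable_like = [float(x) for x in range(10, 31)]

cm_constant = coefficient_of_median(constant_like)   # about 0.02
cm_variable = coefficient_of_median(variable_like)   # 0.5
mean_constant = mean_based_ratio(constant_like)      # about 0.13
```

In this toy case, the single low reading moves the mean-based ratio of the concentrated list to roughly 0.13, while the median-based $C_m$ stays near 0.02, so the gap between the two synthetic devices is wider when the median is used.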

Scope of Application of $C_m$ and Contributions of This Study
The contributions of this research and the applicability of $C_m$ are summarized as follows:
• The successful recognition of variable- and constant-speed ME using the proposed $C_m$ in this paper addresses the research gap related to the reverse identification of equipment type through mining the energy consumption data of air-conditioning systems.
• The involved ME in this study had diverse components and brands and thus exhibited different distributions and scales of energy consumption data. Nevertheless, since differences in energy consumption scales were eliminated and made dimensionless in the development of parameter $C_m$, the method and results were not expected to be influenced by the specific brands, types, models, or power ratings of ME. Consequently, the proposed $C_m$ is expected to be applicable to air-conditioning systems in general, beyond the specific example of this study.
• In addition, even if the constant- or variable-speed nature of the ME is already known, such an investigation could also diagnose whether the variable-speed ME is working properly. In terms of diagnosis, by calculating the actual $C_m$ value for a certain period of time and comparing it with the $C_m$ value of the corresponding design conditions, it is possible to determine whether the operation of the ME is in accordance with the original design or to reflect on whether the design scheme is reasonable.
• The distinguishing results can provide essential information for the subsequent analysis of energy/cost saving potential by optimizing algorithms for different types of ME under their respective practical constraints. Moreover, the proposed $C_m$ can also be of potential use to enhance the automatic processing of data, to reduce the direct involvement of field professionals, and to fundamentally support the automatic computer processing of massive amounts of air-conditioning system energy consumption data.

Limitations
However, this pilot study has the following limitations:
• The data involved in this study came solely from the case system's energy consumption monitoring platform. No corresponding data were available on the platform, such as the water flow rate of pumps, air volume of AHUs, cooling capacity of chillers, or related frequency data of inverters, which could be employed to verify the recognition results of the present case. In addition, energy consumption data were collected at an interval of 1 h, which may seem somewhat long for the analysis in this paper, since constant-speed ME might complete several on/off cycles, or variable-speed ME might change its rotation speed several times, within such an interval. In the present case, however, owing to the local preferential tariff for the ice-storage air-conditioning system, the minimal interval between two adjustments of the ME was 1 h, including on/off control or changes in operating speed related to changes in energy consumption [49]. Therefore, the above limitations had negligible impact on our study, as the results demonstrated the effectiveness of the proposed indicator parameter in the recognition of variable-speed ME.
• The proposed coefficient of the median $C_m$ could effectively recognize variable- and constant-speed operation modes for ME in the current study. Theoretically, the value of $C_m$ ranges from 0 to 1. However, as stated in Section 4.2, the operation of real-world ME is influenced by the device and by the system to which it belongs, as well as by the thermal, vibration, and electromagnetic environments and the power quality of the environment in which it is located; hence, it is difficult to theoretically derive a threshold value for recognition. On the basis of results to date, the confidence interval of the $C_m$ value for ME running at constant speed ranged from 0.07 to 0.13 with 99% confidence, while the confidence interval of the $C_m$ value for ME operating at variable speed ranged from 0.37 to 0.63 in this pilot study. Further study is warranted to collect more energy consumption data from different ME types and to theoretically investigate the factors influencing $C_m$ so as to give a more precise threshold value.

Conclusions
This study collected the energy consumption data of ME in an air-conditioning system in a commercial building. Numerical analysis was conducted to investigate the data characteristics of different ME types and to recognize variable-speed components. An indicator parameter, the coefficient of the median $C_m$, was introduced. Results showed that the energy consumption data distributions of variable- and constant-speed ME were quite different. The coefficients of skewness $C_s$ and of the median $C_m$ could be used to distinguish variable- from constant-speed pumps. However, only the proposed characteristic $C_m$ could be used to successfully distinguish all different types of ME in terms of constant- and variable-speed operation. Moreover, the effective applicability of $C_m$ was further mathematically confirmed by using a feature selection method.
Regardless of the small sample size of this study, the successful recognition of variable-and constant-speed ME using the proposed characteristic in this paper addresses the research gap related to the reverse identification of equipment type. The results are also of potential use for enhancing the automatic processing of data, reducing the direct involvement of field professionals, and supporting the foundation for the automatic computer processing of massive amounts of air-conditioning system energy consumption data.