1. Introduction
The demand for energy in electrified railway systems is enormous on a global scale. Therefore, adapting the energy structure and integrating new renewable energy sources (RESs) become crucial. The pursuit of sustainable development and optimising the use of RESs are becoming increasingly important not only in the context of railway vehicles but also in terms of modernising the power supply for the entire railway infrastructure. Numerous sources describe electric traction systems; for example, [1] discusses various aspects of the power supply of the traction system, covering both direct current and alternating current. Reference [2] explored an energy storage system for the efficient recovery of regenerative braking energy (RBE) and the improvement of power quality in AC-fed railway power supply systems characterised by fault-tolerance capabilities. Examples of such studies also include references [3,4,5]. Reference [6] presents research on railway power supply systems, with particular emphasis on analysing the power factor in the context of safe and reliable operation of traction railway systems. Simulation results and analyses of the impact of various system parameters on the pantograph voltage profile are showcased, along with the determination of maximum power limits drawn by a single train. In [7], the authors proposed the integration of the traction power supply system with photovoltaics (PV), using a back-to-back converter. The system was designed to efficiently utilise PV energy while simultaneously compensating for the excess capacity of the converters and reducing the negative sequence current. Reference [8] presents the potential use of a photovoltaic system with an energy storage system to increase the power of railway facilities beyond the grid consumption limit. The results suggest that adjusting the parameters of the system could lead to cost savings of 1.1 to 2.68 times, depending on the selected settings. However, the available sources are mainly limited to general aspects of the railway infrastructure power supply system, without delving into the specifics of the power supply system for railway signal boxes (SBs).
In [9], the introduction of a fully electrified high-speed rail freight transport system in Europe was analysed and compared with road transport. It was shown that, despite the higher costs, rail emits significantly less CO2. To achieve the goal of carbon neutrality [10,11,12], it becomes essential to consider the energy structure of the entire railway infrastructure in the context of new RESs and the possibility of storing this energy. Analyses of energy consumption and carbon footprints in railway infrastructure focus mainly on rail vehicles (cargo and passenger transport) and energy processing and losses in the traction network. In [13], an energy, exergy, and environmental analysis was conducted in the context of freight trains operated by diesel and electric locomotives, revealing that the carbon emissions of electric trains per unit of useful energy are relatively high due to energy generation. In [14], a model of energy consumption was presented, which allows energy consumption to be estimated in operational train projects and train simulations within intermodal transportation planning tools. Comprehensive research is also being conducted on the definition and measurement of energy efficiency in the railway sector. In [15], the efficiency of carbon emissions in rail transportation across several provinces of China was estimated using a slack-based measure (SBM) model, accompanied by spatial correlation analysis. In [16], the authors attempted to analyse various scenarios of the impact of future emission reduction policies on carbon emissions in rail transport. They developed a model to predict carbon emissions in the railway transport system, integrating a long short-term memory (LSTM) algorithm for time series data and using grey analysis to select key factors with greater correlation. In general, the ratio of theoretically minimal carbon dioxide emissions to actual carbon dioxide emissions at a given input and output is called carbon emission efficiency [17]. With limited input factors, a higher efficiency of carbon emissions in railway transportation leads to higher economic outcomes or lower carbon emissions, supporting economic development and low carbon growth [18].
A separate issue is powering the railway traffic control infrastructure. SBs require a specific power supply due to their critical importance for safety. Therefore, they are powered by at least two independent AC (alternating current) lines and a diesel generator. Computer systems have their own uninterruptible power supplies (UPS). In Poland, the power supply systems for the railway infrastructure are regulated by the guidelines outlined in Annex 3 of the Resolution of 30 December 2019 [19]. The applied battery backup power supply must ensure access to energy for at least 2 h for all devices related to railway traffic control. Regarding access to power lines, these are guidelines that may not need to be met in certain situations. The most important aspect is ensuring power supply for critical elements, especially for track lighting signalling possible emergency situations. Ensuring two independent AC lines connected to two independent medium-voltage to low-voltage (MV/LV) stations is problematic because railway traffic control facilities are often located far from urbanised areas. It is increasingly necessary to lay long-distance high-voltage lines that cover several kilometres to supply power to just one facility.
On the one hand, the requirements for low or zero emission power supply and, on the other, the need to ensure the delivery of electricity to railway traffic control facilities led to the initiation of a project to develop a new solution for the railway SB power supply. This solution aims to reduce the carbon footprint while ensuring the power supply for critical infrastructure. According to [20], there are more than 2000 manually controlled railway crossings in Poland, and according to [21], there are approximately 3000 railway traffic control facilities.
It is also worth noting that similar challenges related to efficient energy management exist in other sectors. A good example is described in Reference [22], where it has been shown that energy demand profiles in residential and commercial buildings often exhibit a high degree of repeatability. The authors determined daily load curves for residential, commercial, and industrial customers based on field measurements conducted by the Utilities of Electric Energy of Sao Paulo State. The statistical analysis of these load curves led to the recommendation of representative load curves according to the consumption range in the residential sector and according to the type of activity in the commercial and industrial sectors. This means that specific energy consumption patterns can be observed during the day or week. For instance, in office buildings, energy demand is typically higher during working hours and lower outside of working hours. In educational institutions, such as schools, these patterns are linked to class schedules. In the case of commercial buildings, such as shopping centres, energy consumption is most often associated with store opening hours and shopping seasons. Identifying these patterns in energy consumption profiles is commonly used for energy management optimisation and the development of effective energy-saving strategies. In [23], the authors developed dedicated visualisation tools based on clustering algorithms, allowing small and medium-sized enterprises to understand these patterns of electricity consumption. This facilitated the identification of demand management strategies and the optimisation of production schedules. Meanwhile, in [24], various clustering algorithms, such as modified “follow-the-leader”, hierarchical clustering, K-means, fuzzy K-means, and self-organising maps, along with data reduction techniques, such as Sammon maps and principal component analysis (PCA), were used to group customers with similar patterns of electricity consumption. Similarly, in the case of railway infrastructure, precise forecasting of energy demand is essential for efficient and sustainable resource management, particularly in relation to controlling the state of charge (SoC), as demonstrated in [25]. The SoC is most commonly forecast from the temperature and input voltage using fuzzy logic methods, such as Mamdani fuzzy logic [26]. Forecasting demand in such a system is exceptionally challenging because it depends on various factors, such as variability of weather conditions, railway traffic, and changes in the availability of RESs. Therefore, it is essential to develop advanced monitoring and data analysis systems capable of providing more precise forecasts of the energy demand of railway infrastructure.
The challenge revolves around aligning the functionality of these structures within the railway infrastructure with energy generation. However, in this context, it is not feasible to adjust the operation of these devices to be in sync with RESs or the availability of energy storage, as indicated in references cited in [27]. Therefore, effective utilisation of RESs to reduce carbon footprints can be achieved through the implementation of local energy storage [28,29]. However, the battery capacity cannot be chosen indiscriminately as it will impact the carbon footprint of the entire investment. The selection of battery capacity must therefore be based on a thorough analysis of actual needs. Additionally, during operation of the installation, the use of the battery capacity should be optimised by employing energy demand forecasting. In such cases, short-term forecasting methods are commonly employed, such as time series methods, for example, simple autoregressive integrated moving average (ARIMA) modelling [30,31]. The necessity to account for the variability of external factors, such as temperature and wind, for which the relationship with the load is not linear, poses certain challenges for these methods. Nevertheless, analysing the correlation between these factors and the load can be helpful, enabling a better understanding of the interrelationships. Within our project, our aim is to identify indicators correlated with the energy demand profile in railway substations. This aim stems from the lack of available data on this topic in the global literature, particularly the absence of statistical analyses on the energy demand profiles for railway facilities.
2. Materials and Methods
The specificity of the railway SBs lies in the energy consumption profile of the railway traffic control devices, which is influenced by the railway traffic management procedures. There are two types of electrical devices in railway SBs, depending on the priority of the power supply. Devices for which power access is critical are railway traffic control devices. These include turnout drives, signalling lighting on tracks, communication and control devices, and level crossing drives. These devices must always be powered. The second, less critical group includes heating and air conditioning, indoor lighting, and other small household appliances.
2.1. The Electricity Demand in Railway SB Buildings
Railway SB buildings play a crucial role in the management of train traffic on railway lines. Depending on their operational methods, various types of SBs have different electricity demands influenced by their equipment, technology, and functioning.
Figure 1 illustrates selected real profiles of electric energy consumption measured in railway SB buildings. Electricity consumption data covers the period of September 2019–August 2022. The key characteristic of these profiles is the presence of measurement errors. Moments with missing measurement results can be observed; in many cases, a cumulative result is provided for the entire period when readings were not taken. Such an approach allows for an energy settlement in the SB but introduces certain complications, especially when used for analysis. Data gaps, especially when they are random or irregular, can affect the accuracy and reliability of the analysis. The cumulative result for the period of missing readings may obscure actual fluctuations in energy consumption or other parameters at specific times, making it challenging to identify patterns.
To understand the system behaviour and draw reliable conclusions, it may be necessary to apply advanced data imputation techniques [32] to fill in missing values. However, even with these tools, measurement errors and missing data can be challenging for analysts and engineers involved in interpreting this information, and such an analysis is beyond the scope of this article.
Three characteristic energy demand profiles can be distinguished for railway SB buildings: the random profile, in which distinctive time periods are absent or difficult to determine; the seasonal profile, where consumption is determined by the time of the year; and the independent (regular) profile, in which there is no apparent seasonal variation in electricity demand. The SB profiles from Figure 1 are assigned to the following groups:
Random: SB no. 1, 4, 5, 11, 12;
Seasonal: SB no. 2, 8, 9, 13, 14;
Independent: SB no. 3, 6, 7, 10.
It should be emphasised that a change in the load profile can result from planned or unplanned events, for example a change in operation, planned renovations, or a change in the power supply source. Such a change is visible in the profile of SB 10, which shows a change in the hourly energy value at a specific point in time. This may involve, for example, changing the type of light sources or the method of preparing domestic hot water.
2.2. Preparation of Data for Analysis
For further analysis, two profiles from each group were selected. Profiles 1 and 5 serve as examples of random profiles, profiles 8 and 13 serve as examples of seasonal profiles, and profiles 6 and 7 serve as examples of independent profiles. The annual energy consumption of each railway SB is shown in Table 1. It can be noted that the electrical energy consumption of railway SBs can exhibit a very wide range.
These profiles were corrected by removing missing data and correcting inaccurate data, separately for each profile. They were then normalised according to the formula:
$$E_{n,i} = \frac{E_i}{E_{\max}}$$
where:
$E_{n,i}$ is the normalised energy in the i-th hour,
$E_i$ is the energy in the i-th hour,
$E_{\max}$ is the maximum energy value in the analysis period.
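As a minimal sketch of the normalisation step described above (each hourly value divided by the maximum value in the analysis period), the following Python function may be helpful. The function name and the sample values are hypothetical and do not come from the measured SB data.

```python
def normalise_profile(energy):
    """Divide each hourly energy value by the maximum of the analysis period."""
    e_max = max(energy)
    if e_max <= 0:
        # Assumption: a profile with no positive consumption cannot be scaled.
        raise ValueError("maximum energy must be positive")
    return [e / e_max for e in energy]


# Hypothetical hourly energy values (for illustration only).
hourly_energy = [1.2, 0.8, 2.4, 1.6]
normalised = normalise_profile(hourly_energy)
```

After this step every profile lies in the range from 0 to 1, so profiles of SBs with very different annual consumption can be compared on a common scale.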
Data normalisation is one of the key steps in data processing aimed at adjusting various values to a common range. In this case, this process was conducted to facilitate the analysis and correlation of these data with the indicators that influence the profiles. In practice, normalisation enables the identification of patterns, trends, and relationships between different study elements, as previously demonstrated [33,34].
For example, without normalisation, comparing indicators with very different value ranges would be difficult and could lead to erroneous conclusions. Normalisation allows for a consistent treatment of all data, regardless of their original scale, which is essential for precise and consistent analysis. As a result, it is possible to correlate profiles with indicators and understand the influence of individual factors, thus allowing reliable conclusions to be drawn from the conducted research.
The corrected and normalised profiles for the selected SBs are presented in Figure 2.
Figure 3 presents two-day hourly profiles of selected SBs during the winter and summer seasons. These hourly profiles are essential for determining the impact of various parameters on the variability of energy consumption, ultimately helping to develop an energy management system. The target system will be designed to maximise the utilisation of planned RESs and energy storage facilities. The detailed hour-by-hour breakdown is crucial for understanding how energy consumption patterns vary throughout the day and between seasons. When analysing these hourly profiles, it becomes possible to adjust the energy management system to better align with fluctuating energy demands, thus enhancing overall energy efficiency.
2.3. Methods of Analysing Data of Electricity Demand Profiles
The purpose of the article is to investigate and analyse the correlations between indicators and the characteristic profiles of railway SBs, specifically:
temperature: average hourly temperature in °C;
wind speed: average speed over an hour in m/s;
sunshine/cloud cover: in oktas, values in the range of 0 to 9;
precipitation: value for 6 h in mm;
day of the week: numeric values in the range of 1 to 7, with Sunday serving as the beginning of the week;
hour of the day;
month;
the previous day: hourly energy values from the previous day;
duration of the day: the relative length of the day from sunrise to sunset;
day/night.
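The time-related indicators in the list above can be derived from the timestamp of each hourly reading. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the hour is encoded as 0 to 23, the relative day length is taken as a fraction of 24 h, the previous-day indicator is taken as the value 24 hourly samples earlier, and sunrise and sunset times are assumed to be supplied from an external source in local time.

```python
from datetime import datetime


def time_indicators(ts, sunrise, sunset):
    """Return time-related indicators for one hourly timestamp.

    ts, sunrise and sunset are naive datetimes in local time; sunrise and
    sunset are assumed to be provided for the same calendar date as ts.
    """
    day_seconds = (sunset - sunrise).total_seconds()
    return {
        # isoweekday(): Monday=1 ... Sunday=7; remapped so that Sunday=1.
        "day_of_week": ts.isoweekday() % 7 + 1,
        "hour": ts.hour,                      # assumed encoding: 0..23
        "month": ts.month,
        "day_length": day_seconds / 86400.0,  # assumed: fraction of 24 h
        "is_day": 1 if sunrise <= ts < sunset else 0,
    }


def previous_day(energy, lag=24):
    """Hourly values shifted by one day; None where no earlier value exists."""
    return [energy[i - lag] if i >= lag else None for i in range(len(energy))]


# Hypothetical example: noon on Sunday, 2 January 2022.
row = time_indicators(
    datetime(2022, 1, 2, 12, 0),
    sunrise=datetime(2022, 1, 2, 7, 45),   # illustrative value
    sunset=datetime(2022, 1, 2, 15, 35),   # illustrative value
)
lagged = previous_day(list(range(30)))
```

The weather indicators (temperature, wind speed, cloud cover, precipitation) are not computed here, since they come directly from the meteorological records.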
Weather data were downloaded from the database of the Polish Institute of Meteorology and Water Management [35] for the locations closest to the selected railway SBs. Table 2 shows some example weather data. Time-related indicators were defined for Poland (Warsaw time), and example values are listed in Table 3.
The range of variability of important coefficients is presented in Table 4.
To understand how individual indicators influence the SB profiles, a deep analysis is necessary. For this purpose, the authors of the article employed statistical and mathematical techniques, such as correlation analysis [36]. The application of these methods allowed for the identification of significant relationships and the development of models describing the interactions between the variables under investigation.
The conclusions drawn from this article are fundamentally important for the selection of RES installations and energy storage solutions. They help ensure the proper functioning of SBs and effective backup power, while simultaneously limiting the need for oversizing elements of the power system. Determining the impact and finding correlations between factors and energy consumption profiles also contribute to reducing CO2 emissions by considering the entire lifecycle of RES installations. The results obtained will be used to develop an energy consumption forecasting and management system, which is the goal of the green SB project.
To test the correlation between selected indicators and the profile, several methods were chosen to demonstrate the existence of a relationship. Two computational methods were selected: Pearson’s method, which measures linear relationships, and Spearman’s method, which measures monotonic relationships. The distance correlation test was also used. In addition, a graphical method—a scatter plot—was employed for data analysis.
Pearson’s method, which has a wide range of applications, was used to identify potential linear relationships. In [37], the authors applied it to demonstrate the correlation between the accuracy of current transformers (CT) and changes in temperature and frequency. In [38], Pearson’s method was used in the context of fuzzy picture sets to solve problems with multiple attribute decision making (MADM).
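Pearson’s correlation coefficient is a numerical value ranging from −1 to 1 that determines the strength and direction of the relationship between the variables under investigation. Pearson’s correlation coefficient r can be calculated using the formula [30]:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where:
$x_i$, $y_i$ are the values of the variables x (independent) and y (dependent);
$\bar{x}$, $\bar{y}$ are the mean values of the variables x, y.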
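A minimal Python sketch of Pearson’s coefficient, written directly from the standard definition (covariance of the deviations from the means divided by the product of their root sums of squares), is given below. The function name and the sample data are hypothetical; in practice a statistics library would normally be used.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: normalised load falling linearly with temperature.
temperature = [-5.0, 0.0, 5.0, 10.0, 15.0]
load = [0.9, 0.8, 0.7, 0.6, 0.5]
r_temperature = pearson_r(temperature, load)
```

For this illustrative data the coefficient is close to −1, i.e., a strong negative linear relationship.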
Spearman’s method, like Pearson’s method, does not detect time-related dependencies but is much less sensitive to outliers. The method was used, for example, to assess the relationship between terrain parameters, such as elevation, slope, and curvature, and flood characteristics in [39], as well as in the context of data analysis and optimisation of wind speed prediction in [40]. It is a nonparametric method that makes no assumptions about the distribution of variables. However, its limitation in this case is that it checks only for monotonic relationships. Spearman’s correlation coefficient $r_s$ can be calculated using the following formula:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where $d_i^2$ is the squared difference between the ranks of the corresponding values of x and y, and n is the number of data pairs.
The interpretation of the value of the coefficient $r_s$ is the same as for the Pearson correlation coefficient.
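The rank-based calculation can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the shortcut formula based on squared rank differences is exact only when there are no tied values (with ties, the coefficient is usually computed as Pearson’s coefficient of the ranks instead).

```python
def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's coefficient via the squared rank differences (no ties assumed)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rank_x = average_ranks(x)
    rank_y = average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical monotonic but nonlinear data.
x_values = [1, 2, 3, 4, 5]
y_values = [1, 4, 9, 16, 25]
rs_increasing = spearman_rs(x_values, y_values)
rs_decreasing = spearman_rs(x_values, list(reversed(y_values)))
```

Because only the ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1 regardless of its shape.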
Another data analysis method that we applied is the scatter plot. This method is a type of graphical representation of the spread of the data used to show the relationship between two variables. In a scatter plot, the values of variables are represented as points on a Cartesian plane, where one axis represents one variable, and the other axis represents the other variable. This method is commonly employed, for instance, in the identification of objects in remote digital images within the red and infrared spectrum [41]. In our case, we determine the trend of dependence of electricity consumption on the variable being studied, such as temperature. This method allows us to visualise the relationship between variables and identify outliers. The measure of fit is the R² value.
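The R² value of a trend line can be sketched as follows. The type of trend line used for the scatter plots is not stated here, so a straight line fitted by ordinary least squares is assumed; the function name and the sample data are hypothetical.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R² value."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    if s_xx == 0:
        raise ValueError("x must not be constant")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    if ss_tot == 0:
        raise ValueError("R² is undefined for a constant y")
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data: normalised load against temperature in °C.
temperature = [-5.0, 0.0, 5.0, 10.0, 15.0]
load = [0.9, 0.8, 0.7, 0.6, 0.5]
slope, intercept, r_squared = linear_fit_r2(temperature, load)
```

For a straight-line fit, R² equals the square of Pearson’s coefficient, so a higher-order trend line would be needed for R² to reveal a relationship that Pearson’s method misses.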
The last method used is the distance covariance test applied in the absence of monotonic dependencies [42]. Distance correlation can also be used as a measure of dependence, for example, in meta-analyses [43], and for vectors whose joint distributions belong to the class of Lancaster distributions [44]. This method allows for determining the strength of the relationship between two random variables, including nonlinear relationships. It differs from Pearson’s and Spearman’s correlations because it can detect nonlinear dependencies and operates in a multidimensional context. The result of the distance correlation ranges from 0 to 1, where 0 indicates the independence between the variables x and y, and 1 indicates that one variable is an exact linear function of the other.
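A minimal sketch of the sample distance correlation for two one-dimensional variables is shown below, using double-centred pairwise distance matrices. The function names and the sample data are hypothetical, and the quadratic memory use makes this form suitable only for short series; it is not a description of the software used in the study.

```python
import math


def _centred_distances(values):
    """Pairwise absolute distances, double-centred by row, column and grand mean."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetry)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of paired one-dimensional samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a = _centred_distances(x)
    b = _centred_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # convention when one of the variables is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


# Hypothetical symmetric, non-monotonic relationship: y = x**2.
x_values = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_values = [v ** 2 for v in x_values]
dcor_quadratic = distance_correlation(x_values, y_values)
dcor_linear = distance_correlation(x_values, [2 * v + 1 for v in x_values])
```

For the quadratic example Pearson’s coefficient is zero, while the distance correlation is clearly positive, which illustrates why the method is useful when no monotonic dependency is present.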
The advantages and disadvantages of the selected methods are summarised in Table 5, while their applicability scope is presented in Table 6.
3. Results
In the context of the global climate change and the increasing dependence on RESs, understanding the correlation between energy demand profiles and weather conditions becomes crucial. Multiple studies and analyses indicate that atmospheric conditions, such as temperature, wind speed, and sunlight, directly influence energy consumption. Additionally, factors, such as seasons, specific days of the week, and time of a day, also affect the energy demand profiles.
The analysis explored the correlation between energy demand profiles and selected factors related to weather conditions and time-related variables. Incorporating these variables into the analysis allowed understanding and predicting the dynamics of energy consumption. The ultimate goal is not only a better forecast of demand, but also the optimisation of energy systems and the associated necessary investments to meet supply and demand challenges in a rational manner.
The article demonstrates the use of methods described in Section 2.3 using temperature as an example. For the remaining indicators, only the analysis results obtained in a similar manner are presented.
3.1. Detailed Results Obtained for Temperature
The detailed presentation covers the methods used for analysing hourly electricity consumption data, focusing on the independent variable temperature. Temperature was chosen for the presentation of the method because the results obtained vary significantly for the selected methods and boxes.
Table 7 presents the results of the correlation coefficients for temperature.
Figure 4 illustrates the results of the scatter plot method. These graphs enable a visual assessment of the repeatability, trends, and dependencies between both variables.
3.2. Results Obtained for Correlation Analysis
4. Discussion
In the context of research on the impact of various environmental factors on energy consumption profiles, correlation analysis is an essential tool that supports the selection of factors for further analysis and forecasting of demand profiles. The obtained results provide valuable information on how specific variables, such as temperature, cloud cover, precipitation, or daylight hours, can influence energy consumption in different SBs. The influence of individual factors on the demand profile has been described in the order of the studies conducted.
Known correlation analysis techniques were used for the analysis. These techniques yield satisfactory results when assessing independent variables to predict electricity demand. However, the measured electricity consumption profiles in railway SBs are not repeatable, unlike those of other facilities, such as commercial buildings, educational institutions, or offices, whose profiles are typically repetitive. The diversity of demand profiles in railway SBs indicates that different processes are occurring within them, although their operational objectives are similar. Consequently, the adaptation of RES devices and energy storage units cannot be standardised based on factors, such as power demand or annual energy consumption, but must be tailored to the specific shape of the consumption profile. This diversity requires a departure from the typical approach of standardising RES supply systems and energy storage units according to predetermined power demand or annual energy consumption patterns. Instead, it requires a nuanced understanding of the unique consumption profiles and the application of appropriate forecasting methods to accurately match the demand. When reviewing scientific publications, we did not find similar analyses for a specific consumer group such as railway SBs.
Similarly to [45], there is a significant impact of temperature variability on electricity consumption despite different locations. This article uses similar analysis methods, but they were applied to different types of facilities. Compared with residential, industrial, and office buildings, SBs are characterised by highly systematic and continuous operation. The increase in energy consumption in SBs depending on the temperature is visible in buildings with installed electric heating. Another factor is the time of day and the length of the day because it is necessary to ensure adequate light intensity throughout the facility.
The energy consumption profile is very similar to that of the previous day. This is due to the continuity of time of consecutive days and, at the same time, the continuity of weather conditions, human behaviour, and other factors. This means that there is a significant similarity between consecutive days and that the variability of conditions is only caused by the time of day. However, in some buildings, there is a visible relationship over long periods and seasons. Combining time series methods with artificial neural network algorithms may allow for better results [46]. Due to the nonlinear properties of time series demand profiles, the methods used are not always effective in short-term energy demand forecasting [47]. Time series methods and correlation analysis allowed the identification of indicators useful for more advanced methods, e.g., artificial intelligence. It is also impossible to clearly determine which indicators will be useful due to the character of the demand profile. Therefore, time series analyses must be performed for each analysed profile. Detailed results are presented below.
4.1. Temperature
Temperature has a significant impact on energy consumption in most SBs, especially in SBs 7, 8, and 13, where strong negative correlations are observed. This means that an increase in temperature is associated with a decrease in energy consumption in these SBs. SB 1 is the least sensitive to temperature changes, while SB 8 is the most sensitive.
4.2. Wind Speed
The data analysis suggests that the relationships between wind strength and the consumption profiles of different SBs are mostly low to moderate. Among all the cases analysed, SB 7 shows the most distinct, albeit still low, negative correlation. For SBs 5 and 6, the correlation is even smaller; however, the distance correlation method, where the correlation is highest, indicates some weak dependency.
4.3. Cloud Cover
Cloud cover has the greatest impact on SB 7. For SBs 1, 8 and 13, the correlation values in all methods are very low, similar to SBs 5 and 6. Obtained results indicate some impact of cloud cover on the profiles, but it is low.
4.4. The 6 h Precipitation
The results suggest a very low impact of 6 h precipitation on the data in each of the SBs, although SBs 5 and 6 show slightly higher correlation values. This suggests that there is no strong relationship between precipitation and the data in these boxes.
4.5. Days of the Week
After analysing the correlation values for different SBs in the context of the days of the week, we notice that these values are extremely low. Regardless of the analysis method or the specific SB, the correlation indicators do not indicate any clear relationship between the consumption profile and a specific day of the week.
4.6. The Hour of the Day
Analysing the correlation between the data and specific hours in individual SBs, it can be observed that for most SBs, the correlation coefficients are low, indicating a minor influence of the hour on the presented data. However, the distance correlation method shows moderate correlation values, especially for SBs 6 and 7, suggesting some influence of the hour, although not a strong one. Therefore, the hour is a factor that may have an impact on forecasting but cannot be considered as a standalone indicator.
4.7. The Month of the Year
For each SB, the correlation values are different; however, in most cases, the rates are low to moderate (SBs 8 and 13). The distance correlation method often shows the highest correlation compared to other methods. However, in general, the results suggest that the impact of the month on the presented data is limited.
4.8. Energy Consumption from the Previous Day
The values of the correlation coefficients in relation to the energy consumption of the previous day have different trends. SB 6 has the highest correlation of all SBs tested, suggesting extremely consistent and predictable energy consumption from day to day. SB 5 generally shows lower correlation rates, which may indicate a greater irregularity in energy consumption compared to the previous day. The remaining SBs 1, 7, 8 and 13 fall between these two extremes, with varying degrees of correlation, suggesting varying degrees of stability in energy consumption. Regardless of the analysis method, these results highlight the existence of varying energy consumption patterns across railway SBs, with some showing greater predictability than the others.
High values of correlation coefficients for analysis with consumption from the previous day show that shorter periods of historical data should be used for forecasting. Profiles often lack seasonal repeatability.
4.9. Length of the Day
With respect to day length, we observe a clear variation in the results. For SBs 8 and 13, the correlations are extremely strong and negative in most methods, suggesting that the longer the day, the lower the energy consumption in these SBs. SBs 6 and 7 show moderate negative correlations, indicating some degree of relationship between day length and energy consumption. However, the relationship is not as strong as that noted in the previous cases. SBs 1 and 5 have low correlation indices, suggesting that day length has little or no effect on energy consumption in these SBs.
The results indicate that the correlation is low for profiles with random consumption, moderate for the independent profile, and strong for seasonal profiles for which day length is important.
4.10. Day/Night
Analysing the correlation values in the context of day and night, it can be observed that SBs 6 and 7 exhibit strong negative correlations in most methods, suggesting that an increase in daylight leads to a significant decrease in energy consumption in these SBs. A similar, though slightly weaker, trend is visible for SBs 8 and 13. In the case of SB 5, although a negative correlation is also observed, it is stronger in the Spearman’s and distance correlation methods, indicating that this relationship might be more nonlinear than in other SBs. For SB 1, correlations are relatively low in all methods, indicating a lack of a clear relationship between the day/night period and energy consumption.
For most examined SBs, a negative correlation between daylight and energy consumption is observed, although the strength of this relationship varies depending on the specific box. SB 1 appears to be an exception to this rule, showing a slight correlation indicating a lack of a clear dependency.
4.11. Complementarity Analysis
It is also important to discuss the results considering the complementarity of the selected analysis methods. For example, if Pearson’s correlation indicates a weak relationship but R² is high, it may suggest a nonlinear relationship between the variables, which is worth exploring using other methods. Similarly, if the Spearman’s correlation is strong but R² is low, it may indicate a significant nonlinear relationship. If the Pearson’s correlation is high but Spearman’s correlation is low, it might suggest a strong influence of outliers. The Spearman’s correlation can indicate the presence of a monotonic relationship, while the distance correlation is more suitable for capturing the cyclical nature of the relationship. Using Pearson’s correlation in conjunction with distance correlation allows for a better understanding of the nature of the relationship, considering both linear and nonlinear dependencies. Spearman’s correlation is more robust against outliers compared to Pearson’s correlation. If both methods indicate a relationship, this supports a genuine correlation between the variables. The R² of the scatter plot can demonstrate how well a linear model fits the data. However, if the relationship is more complex, distance correlation can indicate a relationship even when R² is low. Using Pearson’s correlation along with a scatter plot will not only help identify linear trends, but also highlight potential anomalies or exceptions. A scatter plot with R² allows visualisation of the relationship, while the Spearman’s correlation emphasises its monotonicity.
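The complementarity of the measures can be illustrated with a minimal sketch, assuming a hypothetical, purely cyclical load profile (consumption following a cosine over the hours of one day). The data and all function names are illustrative and are not taken from the SB measurements. Here R² is taken as that of a least-squares straight line, which equals the square of Pearson’s coefficient, and the distance correlation is computed from double-centred pairwise distance matrices.

```python
from math import cos, pi, sqrt

def pearson(x, y):
    """Pearson's correlation coefficient (linear dependence)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation (monotonic dependence)."""
    return pearson(average_ranks(x), average_ranks(y))

def r_squared_linear(x, y):
    """R² of a least-squares straight line, equal to Pearson's r squared."""
    return pearson(x, y) ** 2

def _centred_distances(v):
    """Pairwise absolute distances, double-centred by row, column and grand mean."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # the matrix is symmetric: row means = column means
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation (sensitive to nonlinear dependence)."""
    a, b = _centred_distances(x), _centred_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for r in a for v in r) / n ** 2
    dvar_y = sum(v * v for r in b for v in r) / n ** 2
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))

# Hypothetical cyclical profile: load in kW as a cosine of the hour of the day.
hours = list(range(24))
consumption_kw = [round(10 + 5 * cos(2 * pi * h / 24), 3) for h in hours]

results = {
    "pearson": pearson(hours, consumption_kw),
    "spearman": spearman(hours, consumption_kw),
    "r_squared": r_squared_linear(hours, consumption_kw),
    "distance": distance_correlation(hours, consumption_kw),
}
for name, value in results.items():
    print(f"{name:>10}: {value:+.3f}")
```

For this profile, Pearson’s and Spearman’s coefficients and R² remain close to zero, whereas the distance correlation is clearly larger, which is the pattern that points to a cyclical rather than a linear or monotonic dependence.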
Based on the complementary analysis, the following observations can be made:
4.12. Results for SB 10
The verification of the obtained results was specifically conducted for SB 10 (Figure 5), which is characterised by a large change in the demand profile caused by renovation works. Such a profile will allow for a good verification of the conclusions from the conducted research.
The results of the analysis examining the impact of various factors on the energy consumption profiles of SBs are comprehensively detailed in Table 17.
Taking into account the complementarity analysis for SB 10 as collected in Table 17, the following conclusions can be drawn.
For the factor of energy consumption from the previous day, the energy consumption in SB 10 shows very strong positive correlations across all methods, which is a clear signal that historical data can be an effective predictor of current energy consumption. The high Pearson’s correlation shows a linear dependency, and the equally high Spearman’s and distance correlation suggest that this relationship is both strong and stable regardless of the methodology, confirming the monotonicity and nonlinearity of the relationship.
The day/night variable shows significant negative correlations in both Pearson’s and Spearman’s methods, suggesting that energy consumption is higher at night than during the day. Higher R² and distance correlation values also indicate the significance of this relationship, which may include nonlinearities not captured by a simple linear analysis.
The length of the day has moderate negative correlations indicating the influence of this factor, but the relationship may be more complex than a simple linear model.
Temperature and wind speed show moderate negative correlations, suggesting that they may influence energy consumption, but the relationship is not strong and may be nonlinear.
The month of the year shows a slight positive correlation, which may indicate a minor impact of seasonality on energy consumption.
Cloud cover and precipitation show very low correlation values, suggesting a lack of significant direct impacts of these factors on energy consumption.
Days of the week and the hour of the day have minimal or no correlations, indicating that they are not significant predictors of energy consumption in this instance.
Analysis of energy consumption in SB 10 can utilise the data from the previous day for forecasting. At the same time, the impact of the length of the day and the day/night cycle should be considered, which may require more complex models for an accurate prediction. Other variables can be omitted from predictive models. It should be emphasised that forecasting for SB-type buildings is effective when the previous day’s data are considered, even in situations of high variability in SB operations. However, unforeseen changes in load on subsequent days cannot be forecast from such data.
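The role of the previous day’s data can be illustrated with a naive persistence baseline, in which each day is forecast as the value of the day before. This is a minimal sketch under stated assumptions and not the forecasting model used in this study; the daily values, including the step change meant to mimic a sudden shift in demand, are hypothetical.

```python
def persistence_forecast(daily_kwh):
    """Forecast each day as the previous day's value (the first day has no forecast)."""
    return daily_kwh[:-1]

def mean_absolute_error(actual, predicted):
    """Mean absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily consumption in kWh with an abrupt change in demand.
daily_kwh = [40, 41, 40, 42, 41, 70, 71, 69, 70, 72]

forecast = persistence_forecast(daily_kwh)
actual = daily_kwh[1:]
errors = [abs(a - p) for a, p in zip(actual, forecast)]

print("absolute errors:", errors)
print("MAE:", round(mean_absolute_error(actual, forecast), 2))
```

In this sketch, the error is large only on the day of the abrupt change and returns to a low level on the following day, because the baseline adapts as soon as the new level appears in the previous day’s data.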
5. Summary
The temperature has a significant impact on the energy consumption values in the SBs, especially in SBs 7, 8, and 13, which are characterised by strong negative correlations. An increase in temperature leads to a decrease in values in these boxes. SB 1 is the least sensitive to temperature, while SB 8 is the most sensitive. In terms of wind strength, most SBs show low to moderate dependencies. However, SB 7 stands out with the most distinct negative correlation, although it is still low. Cloud cover has a limited impact on the values in most SBs, with SB 7 showing the strongest dependencies. SBs 1, 5, 8, 13, and 6 exhibit very low correlations. The 6 h precipitation generally has a small impact on all SBs, but SBs 8 and 13 show slightly higher correlations. The analysis of days of the week did not reveal any clear dependencies. Regarding hourly analysis, correlation coefficients are generally low; however, the distance correlation method suggests some influence on SBs 6 and 7. The month coefficient has a limited impact on the values in SBs, although different SBs show various correlation indicators. The analysis of the energy consumption of the previous day indicates varied patterns in different SBs, with SB 6 showing the highest correlation and predictability. In the context of the duration of the day, SBs 8 and 13 are characterised by exceptionally strong negative correlations, suggesting a significant impact of the duration of the day on energy consumption. For SBs 6 and 7, there is a certain degree of dependency. However, for SBs 1 and 5, the influence is minimal or absent. When analysing day/night, most SBs show negative correlations, suggesting that increased daylight leads to a decrease in energy consumption. The strength of this relationship varies depending on the SB, with SB 1 showing a small or no correlation.
In the case of many SBs, distance correlation indicates potential nonlinear relationships, even if Pearson’s and Spearman’s correlation methods suggest weak connections. This is especially noticeable in the temperature analysis for SBs 1, 5, and 6 and in the wind speed analysis for SBs 7, 8, and 13. In some analyses, such as cloud cover or day length, differences between Pearson’s and Spearman’s correlations may suggest the influence of outliers on the results. The analyses of precipitation and days of the week do not show significant relationships for any of the SBs, suggesting that these variables do not significantly affect the analysed signals.
In the analysis of the previous day, high values of correlation coefficients were observed for all correlation methods, indicating a significant relationship between signals in this category. Potential nonlinearities were observed in the analysis of hours and months, especially in SBs 7, 8, and 13. In the analysis of the day/night variable, potential nonlinear dependencies or the influence of outliers are observed, suggesting that the change from day to night (or vice versa) could significantly impact the analysed signals.
The analysis of SB 10 was conducted as a verification of the research method. The data indicate that historical energy consumption patterns and the day/night cycle are significant predictors of current energy use. Strong correlations across various statistical methods substantiate the predictive power of these factors. Although temperature and wind speed exhibit some correlation, their influence is less straightforward and potentially nonlinear. Seasonality and environmental variables, such as cloud cover and precipitation, show limited predictive utility. The findings emphasise the utility of historical and diurnal patterns over meteorological factors in the predictive modelling of energy consumption for SB 10. However, unexpected load changes in the following days disturb the forecasts.
6. Conclusions
In conclusion, various environmental and time-related factors have diverse impacts on energy consumption in different SBs. Many of these factors have a small to moderate influence on SB values; however, certain SBs, such as SBs 3, 4, 5, and 6, often exhibit clearer dependencies in response to specific factors. Forecasting energy consumption in different SBs requires taking these dependencies into account, as well as potentially shorter forecasting periods, especially regarding consumption from the previous day.
Temperature has a distinct impact on the values in most SBs. An increase in temperature is associated with a decrease in values, especially in SBs 4, 5, and 6. SB 5 is the most sensitive to temperature changes. The energy consumption of the previous day indicates the existence of varied energy consumption patterns. SB 3 has the highest correlation, suggesting consistent and predictable day-to-day energy consumption. Consumption profiles often lack seasonal repeatability, except for SBs 5 and 6, where correlations with day length are exceptionally strong and negative, indicating that longer days lead to lower energy consumption. For SBs 3 and 4, strong negative correlations regarding day/night indicate a decrease in energy consumption with an increase in daylight. A similar trend is observed for SBs 5 and 6. In summary, temperature, consumption of the previous day, day length, and day and night periods have the most significant impact on energy consumption, but the extent of this impact varies depending on the specific SB.
Taking into account the types of profiles, the analysis showed that the factors yielding strong correlations vary. The analysis based on the profile types can be summarised as follows:
SBs 1 and 2 (random): In these buildings, there are no clear correlations or repeatable patterns of energy consumption from day to day. External factors, such as temperature, day length, or the previous day, might have a limited impact on energy consumption in these boxes due to randomness of consumption. Forecasting and managing energy in SBs 1 and 2 could be more challenging compared to other cases.
SBs 3 and 4 (independent): Although these SBs do not exhibit seasonal repeatability in energy consumption, they appear to be more predictable compared to SBs 1 and 2. SB 3 has an exceptionally high correlation with the energy consumption from the previous day, indicating that its consumption depends strongly on that of the preceding day. However, changes in temperature, day length, and other external factors might influence the energy consumption, although not necessarily in a seasonal manner.
SBs 5 and 6 (seasonal): These SBs show clear correlations with seasonal factors, such as temperature and day length. For example, SB 5 is particularly sensitive to temperature changes. This suggests that energy consumption in these boxes fluctuates cyclically throughout the year, allowing better forecasting and management of energy consumption in SBs 5 and 6 based on weather or seasonal forecasts.
The complementary analysis revealed diverse relationships between the signals and the coefficients studied. Nonlinear relationships exist in several categories, especially where the distance correlation indicates stronger connections. Outliers can impact results in certain categories (such as temperature, wind speed, and cloud cover), as indicated by differences between Pearson’s and Spearman’s correlations. Furthermore, variables such as precipitation and days of the week do not show significant associations with the signals. In the temporal analysis, particularly concerning the previous day and the day/night variable, strong and potentially nonlinear dependencies were observed.
The most serious limitation of the method used is its sensitivity to data incompleteness: missing data strongly affect the quality of the results obtained. Extreme values, e.g., those resulting from measurement errors, should also be excluded. The method also appears to be sensitive to the length of the period considered. Depending on the factor taken into account, it seems necessary to adjust the length of the time series from which the correlation coefficient is calculated. A time series that is too long may have a negative impact on the result obtained. Determining the appropriate length will be the next stage of research.
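The sensitivity to the length of the period can be illustrated with a minimal sketch that computes Pearson’s coefficient over the most recent samples for several window lengths. The data are hypothetical and constructed so that the relationship between the factor and the consumption differs between the older and the more recent part of the series; the function names are illustrative and do not describe the procedure planned for further research.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_by_window(x, y, window_lengths):
    """Pearson's r over the most recent `w` samples, for each window length w."""
    return {w: pearson(x[-w:], y[-w:]) for w in window_lengths}

# Hypothetical daily data: in the last 10 days consumption falls linearly with
# temperature; in the 10 days before that it followed a different pattern.
temperature = [5, 9, 3, 8, 2, 7, 4, 9, 1, 6,
               0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
consumption = [45, 52, 41, 50, 40, 49, 43, 53, 38, 47,
               60, 56, 52, 48, 44, 40, 36, 32, 28, 24]

by_window = correlation_by_window(temperature, consumption, [10, 20])
for window, r in by_window.items():
    print(f"last {window:2d} days: r = {r:+.3f}")
```

In this constructed example, the shorter window recovers the recent linear relationship exactly, while the longer window mixes two different regimes and reports a weaker coefficient.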