1. Introduction
Local natural gas distribution companies (LDCs) rely on accurate forecasts to facilitate decision making across various functions within their organizations. Historically, natural gas forecasting models have been developed at a single level of aggregation [1]. Temporally, models are designed for specific timeframes, e.g., hourly or daily. In the context of gas operations, geographical operating areas are often subdivided into smaller sub-regions or customer classifications and independently modeled. However, it is unlikely that the forecasts at all hierarchical levels are coherent; the underlying hours and days do not aggregate to the corresponding monthly value, and the sub-regions’ gas consumption does not aggregate to the total gas transferred through the system.
For illustration, Figure 1 shows incoherent temporal and spatial hierarchical levels in a natural gas distribution context within a single U.S. state. The scenario is spatially incoherent because the gas consumed in the three geographic sub-regions does not aggregate to the gas burned throughout the state. Temporal incoherence is evident in the forecasted hourly values, which do not aggregate across 24-h periods to equal their daily counterparts.
When disparities arise between hierarchical level forecasts, confidence in the forecasts drops, complicating both operational and strategic planning [2]. This inconsistency, known as “forecast incoherence”, can lead to suboptimal decisions, inaccurate resource allocation, and an increased risk of operational disruptions, thereby undermining the LDC’s ability to meet the dynamic demands of their customers and maintain efficient operations [3].
The primary research question addressed in this study is: How can hierarchical time-series forecasting be applied to natural gas distribution to ensure coherent and accurate forecasts across temporal and spatial hierarchies?
The motivation for using hierarchical forecasting in this context stems from the complexity, reliance on precise timing, and interdependence of the various components of a natural gas distribution network. Achieving aligned decision making is particularly challenging when demand forecasts are incoherent [4]. LDCs rely on hourly, daily, and monthly forecasts for operational and strategic planning, often serving diverse customer bases across large geographic areas [5]. A lack of coherence in gas demand forecasts can disrupt operations, customer relations, finances, and overall sustainability. Incoherent forecasts can lead to supply management disruptions, impacting public safety, electric grid stability, and regulatory compliance. Everyday decisions tied to supply procurement, inventory, and distribution scheduling heavily depend on accurate demand forecasts [6]. Hence, incoherent forecasts can lead to poor choices about resource allocation and operational efficiency.
The task of forecasting natural gas demand is frequently deconstructed into smaller components [5]. By decomposing the challenge of supplying enough gas to the entire distribution network, forecasters can attain a better understanding of specific factors influencing gas demand, leading to more accurate and effective forecasting outcomes. This work shows how improvements can be made over state-of-the-art gas demand forecast solutions by arranging the forecastable components of gas distribution into a cross-temporal hierarchical structure to produce a new set of reconciled forecasts. Hierarchical time-series forecasting entails forecasting at every level within the hierarchy. Time-series reconciliation (TSR) is a framework to reconcile the forecasts subject to a set of aggregation constraints to produce coherent forecasts. The aggregation constraints selected in this work reflect a realistic spatio-temporal structure of an LDC gas operating area.
The contributions of this work lie in the implementation of a novel cross-temporal forecast reconciliation framework for natural gas demand. State-of-the-art gas demand forecast solutions are enhanced with time-series reconciliation techniques. Forecast reconciliation is an a posteriori refinement of existing forecasts: any technique or preferred style of estimation can be used to generate the set of initial base forecasts. Therefore, this work intends to pick up where many leave off, after forecast generation. We show how gas demand forecasting can be improved by using hierarchical time-series forecasting techniques. We review the state-of-the-art hierarchical framework and reconciliation techniques in Section 2. Then, we establish the notation used in our reconciliation and form a hierarchical time series of natural gas data in Section 3. Our cross-temporal forecasting framework is presented in Section 4, and Section 5 provides its performance analysis when applied to a noisy, real-world gas consumption data set.
2. Related Work
A hierarchical time series is a collection of time series arranged by levels of aggregation. Forecast reconciliation is the process of adjusting incoherent forecasts to be coherent using hierarchical constraints [7]. Hierarchical structures facilitate a comprehensive understanding of time-series data, capturing both individual series behaviors and their interactions within larger aggregates. There are two objectives of hierarchical time-series forecasting: (1) to improve forecasting accuracy at each level of aggregation, thus enhancing the granularity and reliability of predictions for individual series, and (2) to maintain coherence, ensuring that the sum of forecast values at each lower level of the hierarchy aligns with the forecast of the corresponding aggregated series. This coherence ensures the forecasts are consistent across all levels of the hierarchy.
To address these objectives, various hierarchical time-series approaches have been developed. These approaches, generally categorized as cross-sectional, temporal, and cross-temporal [8,9], offer distinct strategies for reconciling forecasts across hierarchy levels while improving accuracy within each level. The gas distribution problem fits naturally into a hierarchical framework, with the total gas demanded being the most aggregated level in the hierarchy. The gas demanded is subdivided into temporal and spatial partitions of gas consumption uniquely nested below. Early hierarchical methods employed a form of directional–structural scaling, which involves generating forecasts for a single gas demand series and linearly combining the forecasts to obtain demand estimate series for the other levels in the LDC hierarchy [10]. The bottom-up (BU) forecasting framework [11] generates gas forecasts at the lowest level of the hierarchy and sums them to produce forecasts for the higher levels, e.g., hourly forecasts aggregated to coherent daily forecasts. The top-down (TD) approach operates in the opposite direction, where the top level of the hierarchy is forecasted and disaggregated into the lower, more granular levels, e.g., total gas demanded disaggregated into individual operating areas [12]. The middle-out (MO) approach takes a time series from the middle of the hierarchy and both aggregates and disaggregates the series into a coherent structure [13]. The BU, TD, and MO approaches originate from decomposition and smoothing techniques in spatial econometrics [14]. While these single-series methods are computationally efficient and produce coherent results, they do not consider the inherent correlation structure of the hierarchy.
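As an illustration of the single-series approaches described above, the following minimal sketch applies BU and TD to a toy hierarchy in which a total is the sum of three operating areas. The area names, the demand values, and the use of average historical proportions for TD are assumptions made for this example only and are not taken from the study.

```python
import numpy as np

# Hypothetical hierarchy: total = area_a + area_b + area_c.
# The summing matrix S maps the 3 bottom-level series to all 4 series.
S = np.array([
    [1, 1, 1],  # total
    [1, 0, 0],  # area_a
    [0, 1, 0],  # area_b
    [0, 0, 1],  # area_c
], dtype=float)


def bottom_up(bottom_forecasts):
    """Bottom-up: sum bottom-level forecasts to obtain every level."""
    return S @ np.asarray(bottom_forecasts, dtype=float)


def top_down(total_forecast, historical_bottom):
    """Top-down: split a total forecast by average historical proportions.

    historical_bottom has one row per period and one column per area.
    """
    historical_bottom = np.asarray(historical_bottom, dtype=float)
    proportions = historical_bottom.sum(axis=0) / historical_bottom.sum()
    return S @ (proportions * total_forecast)


# Illustrative values only (units could be Dth).
bu = bottom_up([40.0, 35.0, 25.0])
history = np.array([[38.0, 36.0, 26.0],
                    [42.0, 34.0, 24.0]])
td = top_down(110.0, history)
```

Both outputs are coherent by construction, but each relies on forecasts from a single level only, which is the limitation that motivates the combination approaches discussed next.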
The factors affecting gas demand at one hierarchical level are correlated with the gas demanded at other levels. Sánchez-Úbeda presents the first algorithm capable of balancing gas data using this correlation in their multi-horizon gas demand forecast decomposition [15]. Hierarchical works progressed from single-level (BU, TD, and MO) to combination (COM) approaches to leverage dependencies between different levels [3]. COM approaches [9,16,17] are implemented such that they independently model and produce forecasts for all gas demand series in the hierarchy. The forecasts made at these levels are likely to be incoherent, but much more likely to be accurate than forecasts obtained via the aggregation or disaggregation of a single forecasted series [18]. Numerous studies have concentrated on generating accurate gas forecasts at a single level (ignoring aggregation constraints) [5,6,19]. As a result, COM methods use these specialized forecasts, considering both time and space, and reconcile all levels within the hierarchy simultaneously to attain overall coherence.
Hyndman et al. pioneered early work in optimal reconciliation methods by describing the statistical quantities of the BU, TD, and MO methods and identifying a minimum variance unbiased estimator to produce coherent forecasts [16,17]. Wang successfully applied the insights gained from optimal reconciliation to a grouped time series, finding an optimal weighted-least-squares solution (now known as variance scaling) [7]. These methods summarize the correlations and interactions among the hierarchical levels linearly to optimally combine and reconcile the forecasts. Wickramasuriya [16] and Athanasopoulos [10] show how ad hoc adjustments can be incorporated into the optimal combination process using important covariates, sub-series trends, and domain knowledge. While ample empirical results exist to support the use of COM methods over structural-scaling approaches [3,9], early optimal reconciliation methods were limited to applications with smaller hierarchies either bound to the unit of analysis, such as geographical region and customer type, or to the unit of time [2]. The gas distribution problem depends on adaptability across various hierarchical variables, such as time, geography, customer class, and average yearly usage. Consequently, it is crucial for gas demand forecasts to be coherent across the various hierarchical variables, including temporal and spatial dimensions.
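The linear combination step underlying these optimal reconciliation methods can be sketched as follows. The sketch assumes the generalized least squares form commonly given in the reconciliation literature, in which reconciled forecasts are S (SᵀW⁻¹S)⁻¹ SᵀW⁻¹ ŷ for a summing matrix S, a weight matrix W, and base forecasts ŷ. The hierarchy, the forecast values, and the error variances below are hypothetical and serve only to show the mechanics of a diagonal (weighted least squares) weighting.

```python
import numpy as np


def reconcile(S, W, base_forecasts):
    """Reconcile base forecasts: S (S' W^-1 S)^-1 S' W^-1 y_hat."""
    W_inv = np.linalg.inv(W)
    # G maps all base forecasts to reconciled bottom-level forecasts.
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ G @ base_forecasts


# Hypothetical hierarchy: one total over three operating areas.
S = np.array([
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# Incoherent base forecasts: the areas sum to 100, the total says 105.
base = np.array([105.0, 40.0, 35.0, 25.0])

# Assumed forecast-error variances for each series (illustrative only).
W = np.diag([4.0, 1.0, 1.0, 1.0])

reconciled = reconcile(S, W, base)
```

With a diagonal W, series with larger assumed error variance absorb more of the adjustment, so here the total moves further than any single area while the reconciled areas sum exactly to the reconciled total.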
Given that the physical process of natural gas distribution has both temporal and spatial aggregation constraints, the forecasts should also adhere to these constraints. Cross-sectional hierarchies are constructed from multiple contemporaneous time series, such as geographical divisions or customer groupings [20] (Figure 1). Cross-sectional methods focus on achieving coherence among various spatial elements or series within a specific context, reconciling data across different sections or units at a single point in time [8]. Van Erven and Cugliari were among the first to apply a cross-sectional forecast reconciliation method on energy data [3,21]. Specifically, they implemented a Game-Theoretically Optimal (GTOP) reconciliation method for electricity demand data disaggregated into 17 tariff groupings. Bai and Pinson also propose a distributed reconciliation method based on the GTOP method in their application of day-ahead wind power forecasting [22]. Gawel implements a cross-sectional global and local approach for gas consumption, specifically focusing on the distribution infrastructure in Poland [23]. However, they chose to reconcile only cross-sectionally, across the spatial dimension of their data. Cross-sectional reconciliation methods focus on achieving coherence between forecasts at different aggregates, but not across different frequencies.
Temporal hierarchies are constructed from one or more time series by means of non-overlapping temporal aggregation [11]. Early examples of temporal aggregation were implemented to overcome limited memory availability by smoothing daily stock time series into weekly time series [24]. Jeon, Panagiotelis, and Petropoulos concentrate on producing temporally coherent probabilistic electricity demand forecasts [25]. Temporal reconciliation methods leverage time-series modeling techniques to refine forecasts at different frequencies [2]. Theodosiou investigates how to combine independent, temporally incoherent forecasts using different deep learning architectures in their forecast refinement. Di Fonzo applies hierarchical forecasting to photovoltaic power generation, the closest industrial process to which hierarchical forecasting has been applied [11]. Cross-temporal methods combine elements of both cross-sectional and temporal approaches, emphasizing the incorporation and evolution of temporal patterns in a coherent hierarchy [26].
Cross-temporal reconciliation harmonizes and ensures consistency across temporal and spatial hierarchies [17]. Kourentzes et al. demonstrate how to improve forecast accuracy by exploiting relationships in both the cross-sectional and temporal hierarchies in forecasting Australian tourism data [8]. The tourism forecasting problem resembles the geographical and temporal constraints of natural gas distribution [8]. Spiliotis offers a non-linear perspective of the problem of hierarchical reconciliation, incorporating the constraints of hierarchical time-series forecasting with machine learning forecasting techniques [9].
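To make the combination of cross-sectional and temporal constraints concrete, the sketch below builds a cross-temporal summing matrix as the Kronecker product of a cross-sectional and a temporal summing matrix. This is one common construction and is offered as an assumption, not as the construction used in this work. The three-area hierarchy, the hourly-to-daily aggregation, and the simulated demand values are hypothetical.

```python
import numpy as np

HOURS = 24

# Cross-sectional structure (assumed): total over three operating areas.
S_cs = np.array([
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# Temporal structure: one daily value on top of 24 hourly values.
S_te = np.vstack([np.ones((1, HOURS)), np.eye(HOURS)])

# Cross-temporal structure: every area/total at both frequencies.
S_ct = np.kron(S_cs, S_te)  # shape (4 * 25, 3 * 24)

# Simulated hourly demand for the three areas (rows: areas, columns: hours).
rng = np.random.default_rng(0)
hourly_by_area = rng.uniform(1.0, 5.0, size=(3, HOURS))

# Map the bottom level (area-hour values) to every series in the hierarchy.
all_series = (S_ct @ hourly_by_area.ravel()).reshape(4, HOURS + 1)

# Row 0 is the total; column 0 is the daily value, columns 1..24 are hourly.
daily_total = all_series[0, 0]
```

Any vector produced this way is coherent in both dimensions at once: the daily total equals the sum of all area-hour values, and each area's daily value equals the sum of its own hours.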
Natural gas distribution represents a real-world forecasting problem that theoretically fits the hierarchical forecasting model but has not yet been explored. While theoretical works show that uses of hierarchical time series (HTS) are common, particularly using the Australian tourism data set [8], no published works, to our knowledge, demonstrate the effectiveness of applying hierarchical constraints to more dynamic problems, such as natural gas distribution. Our study addresses this gap by explicitly outlining how the cross-temporal hierarchy is formed for gas distribution, demonstrating how to effectively combine the spatial and temporal dimensions of the demand forecasting problem.
Hierarchical time-series forecasting offers a robust framework for handling complex hierarchies and taking advantage of dependencies between different aggregates, thereby providing more accurate and coherent forecasts, contributing to better planning, resource allocation, and decision making [8]. We impose these natural aggregation constraints in a hierarchical time series of natural gas and introduce the notation used in our reconciliation in Section 3.
4. Results
To our knowledge, published hierarchical forecasting techniques assume that an underlying coherence is present in the training data. As summarized in the literature review (Section 2), numerous researchers have found success in reconciling forecasts using coherence theory-driven optimal combination methods. A particular challenge faced in this work is that the underlying data used to train the base models are not coherent. Figure 7 illustrates this incoherence between the summed hourly demand and daily demand for a sample month (January 2019).
Despite the incoherence visible in Figure 7, we show that forecast reconciliation yields better forecasts than base forecasting models. We attribute this performance improvement to the combination of information from each hierarchical level [2].
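The kind of incoherence between summed hourly demand and daily demand described above can be quantified with a simple check. The sketch below is illustrative only: the function name, the assumption of exactly 24 hourly values per day, and the demand values are hypothetical and do not come from the data set used in this study.

```python
import numpy as np


def daily_incoherence(hourly, daily):
    """Return the absolute and relative gap between summed hourly and daily demand.

    hourly: flat sequence with exactly 24 values per day, in day order.
    daily:  one observed daily value per day.
    """
    daily = np.asarray(daily, dtype=float)
    hourly = np.asarray(hourly, dtype=float).reshape(len(daily), 24)
    gap = hourly.sum(axis=1) - daily
    return gap, gap / daily


# Two hypothetical days at a constant 10 Dth per hour.
hourly = np.full(48, 10.0)
# The second daily reading disagrees with the summed hours.
daily = np.array([240.0, 250.0])

gap, relative_gap = daily_incoherence(hourly, daily)
```

A nonzero gap on a given day indicates that the observed hourly and daily series violate the temporal aggregation constraint before any forecasting takes place.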
We use scale-independent error metrics accounting for the differences in scales across hierarchical levels to evaluate performance [33]. Let the naïve model (NAV) produce incoherent forecasts for both daily and hourly series. The naïve method produces forecasts equal to the last observed value and is frequently used as a benchmark against more sophisticated reconciliation techniques [9]. The base forecasts (6) are included in the following performance analysis to compare how the reconciled forecasts perform against the incoherent estimates. We evaluate the forecasting performance of the naïve, base, VS, and MinT methods in terms of accuracy and bias.
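Forecast skill is measured as the mean absolute scaled error (MASE) and root mean squared scaled error (RMSSE). The mean absolute scaled error is:
$$\mathrm{MASE} = \frac{\frac{1}{h}\sum_{j=1}^{h} \left| y_{T+j} - \hat{y}_{T+j} \right|}{\frac{1}{T-1}\sum_{t=2}^{T} \left| y_{t} - y_{t-1} \right|},$$
where $y$ denotes the actuals, $\hat{y}$ the forecasts, $T$ the number of in-sample observations, and $h$ the number of out-of-sample forecasts. The in-sample MAE of the naïve method in the denominator is favored by Hyndman and Koehler for its consistent availability and effective scaling of errors [33]. The root mean squared scaled error is:
$$\mathrm{RMSSE} = \sqrt{\frac{\frac{1}{h}\sum_{j=1}^{h} \left( y_{T+j} - \hat{y}_{T+j} \right)^{2}}{\frac{1}{T-1}\sum_{t=2}^{T} \left( y_{t} - y_{t-1} \right)^{2}}}.$$
The RMSSE serves as a standardized measure, providing an indication of the relative magnitude of errors by normalizing them based on the in-sample MSE of the naïve method. The absolute mean scaled error (AMSE) is:
$$\mathrm{AMSE} = \left| \frac{\frac{1}{h}\sum_{j=1}^{h} \left( y_{T+j} - \hat{y}_{T+j} \right)}{\frac{1}{T-1}\sum_{t=2}^{T} \left| y_{t} - y_{t-1} \right|} \right|$$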
and minimizes errors using the median.
All measures are scale-independent, meaning averaging across series is possible [9]. Table 1 compares these metrics and their averages across the cross-temporal hierarchy.
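The three scaled metrics can be computed as in the following sketch. It assumes the standard definitions in which out-of-sample errors are scaled by the in-sample error of the one-step naïve forecast; the function names and the toy values are hypothetical, and the exact scaling used to produce Table 1 may differ.

```python
import numpy as np


def _naive_in_sample_errors(train):
    """One-step naive errors on the training data: y_t - y_{t-1}."""
    train = np.asarray(train, dtype=float)
    return train[1:] - train[:-1]


def _errors(actual, forecast):
    return np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)


def mase(actual, forecast, train):
    """Mean absolute error scaled by the in-sample naive MAE."""
    scale = np.mean(np.abs(_naive_in_sample_errors(train)))
    return np.mean(np.abs(_errors(actual, forecast))) / scale


def rmsse(actual, forecast, train):
    """Root of the mean squared error scaled by the in-sample naive MSE."""
    scale = np.mean(_naive_in_sample_errors(train) ** 2)
    return np.sqrt(np.mean(_errors(actual, forecast) ** 2) / scale)


def amse(actual, forecast, train):
    """Absolute value of the mean error scaled by the in-sample naive MAE (bias)."""
    scale = np.mean(np.abs(_naive_in_sample_errors(train)))
    return np.abs(np.mean(_errors(actual, forecast))) / scale


# Toy values for illustration only.
train = [10.0, 12.0, 11.0, 13.0]
actual = [14.0, 12.0]
forecast = [13.0, 14.0]

scores = (mase(actual, forecast, train),
          rmsse(actual, forecast, train),
          amse(actual, forecast, train))
```

Because positive and negative errors cancel inside the AMSE before the absolute value is taken, it reflects systematic over- or under-forecasting rather than overall error size, which is why it is read here as a bias measure.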
In Table 1, lower values indicate better results. The best metrics for each operating area are in bold. The base forecasts have inherent incoherence, while VS and MinT rows display coherent results. Any value in Table 1 exceeding 1.00 indicates that the naïve model outperformed the base forecast model (6). Since the base forecast models have no autoregressive terms, it is not surprising that the naïve one-step-ahead persistence model outperforms all hourly forecasts.
MinT is the most accurate method in this study, with the lowest average hourly and daily MASE and RMSSE across all operating areas and temporal frequencies (Table 1). Comparing these results to the base forecasts and VS reconciliation, Table 1 shows 10% hourly and 3% daily improvements when compared to incoherent base forecasts, and 7% hourly and 9% daily improvements when compared to coherent VS forecasts. Examining the MASE and RMSSE of each hierarchical level, MinT consistently demonstrates superior forecasting accuracy compared to the base forecast and VS methods. This consistency shows the robustness of MinT in reconciling forecasts across different temporal and hierarchical levels.
In measures of bias, MinT does not outperform the VS technique in average hourly and daily AMSE (Table 1). We found this observation interesting, considering that VS is closely related to the implementation of MinT, except for the fact that MinT uses the full covariance matrix of forecast errors produced in (9), whereas VS uses only its diagonal. Because MinT uses more information on hierarchical interactions and effects, we expected its bias to be lower.
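The difference between the two weightings can be illustrated with a small sketch. It assumes that the weight matrix is estimated as the sample covariance of in-sample base forecast errors, with VS keeping only its diagonal and MinT keeping the full matrix. The hierarchy, the simulated residuals, and the base forecasts are hypothetical, and in practice a shrinkage estimator is often preferred to the raw sample covariance used here.

```python
import numpy as np


def reconcile(S, W, base_forecasts):
    """Reconcile base forecasts: S (S' W^-1 S)^-1 S' W^-1 y_hat."""
    W_inv = np.linalg.inv(W)
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ G @ base_forecasts


# Hypothetical hierarchy: one total over three operating areas.
S = np.array([
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# Simulated, correlated in-sample errors (rows: time, columns: series).
rng = np.random.default_rng(1)
mixing = rng.normal(size=(4, 4))
residuals = rng.normal(size=(200, 4)) @ mixing

W_full = np.cov(residuals, rowvar=False)   # MinT-style: full covariance
W_diag = np.diag(np.diag(W_full))          # VS-style: variances only

base = np.array([105.0, 40.0, 35.0, 25.0])  # incoherent base forecasts
vs = reconcile(S, W_diag, base)
mint = reconcile(S, W_full, base)
```

Both results are coherent; they differ only in how the 5-unit discrepancy between the total and the summed areas is distributed, because the off-diagonal terms let the full-covariance weighting account for errors that move together across series.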
Figure 8 provides additional insights into the relative accuracy of each reconciliation method across the gas operating areas. Figure 9 shows the daily results for the same areas.
Natural gas demand forecasting performance is typically assessed using scale-dependent or percent-error metrics evaluated at a single level. Both Figure 8 and Figure 9 show errors measured in dekatherms (Dth), which are calculated by taking the difference between actuals and forecasts for the naïve, base forecast, VS, and MinT methods. Unlike the relative out-of-sample error metrics in Table 1, these results are scale-dependent. Across both hourly and daily plots, errors are distributed around zero, except for fourth quantile outliers. The VS and MinT residuals in Figure 8 and Figure 9 appear nearly identical. This occurs because the generalized least squares solution for MinT converges closely to the weighted least squares solution for VS. This suggests that the weighted least squares solution effectively summarizes most, but not all, intra-hierarchy interactions. A trade-off occurs between computational efficiency and information gain while working with large hierarchies. While the VS method is outperformed in terms of accuracy, it is computationally quicker to calculate than MinT. This trade-off must be considered, especially if reconciliation is carried out over large temporal hierarchies.
Figure 10 and Figure 11 present another view of the unscaled residuals over the 2023 heating season.
In Figure 10 and Figure 11, the actuals are shown in black, base forecasts (BF) in purple, MinT in orange, and VS in yellow. The corresponding dashed lines in the same colors indicate the differences between each method and the observed gas demand. Gas practitioners are particularly interested in accurate forecasts during peak winter periods [23,34,35]. In Figure 10, the orange peaks stand out, indicating that MinT reconciliation effectively leverages the correlation between exogenous weather variables and gas consumption across the operating areas. This is, in part, due to the MinT method’s ability to model inter-hierarchy series interaction (via the inclusion of the off-diagonals of the forecast-error covariance matrix). The VS method performs similarly, but does not consider these interactions and does not perform as well on these peaks. Figure 11 shows daily estimates with a similar pattern of orange peaks, though not as consistently. This suggests that the smoother, lower-frequency daily series benefited from MinT reconciliation, but not as significantly as the hourly results in Figure 10. This also motivates further research into gas demand reconciliation and suggests that training specialized forecasting models for each hierarchical level could better leverage different exogenous correlations.
Given the inherent incoherence in the natural gas data, the results depicted in Figure 8 and Figure 9 show satisfactory performance when contrasting the coherent VS and MinT forecasts with the incoherent naïve and independently generated base forecasts. These findings underscore the applicability and effectiveness of MinT in improving forecasting accuracy in temporal and hierarchical contexts.
5. Discussion
We use cross-temporal forecast reconciliation methods tailored to the needs of local gas distribution. Most time-series reconciliation efforts focus on a particular temporal or cross-sectional hierarchy, raising doubts about the effectiveness of such methods in the context of gas delivery [3]. We hypothesized that insights from single-dimensional approaches could apply to cross-temporal implementations with suitable aggregation structures. Cross-temporal reconciliation is made possible by organizing gas demand series into a hierarchical time-series structure, which is then used to constrain forecasts across both measurement resolution (time) and operating area (space), ensuring coherence. We propose two reconciliation schemes to transform incoherent natural gas base forecasts into coherent sets of gas demand forecasts. A case study is carried out focusing on natural gas demands across three operational regions, forecasting at various geographical levels, and analyzing both hourly and daily frequencies. Cross-temporal reconciliation using the MinT method results in a 10% improvement in hourly forecasts and a 3% improvement in daily forecasts compared to incoherent base forecasts. Additionally, MinT yields a 7% improvement in hourly forecasts and a 9% improvement in daily forecasts compared to coherent VS forecasts.
Future research on the cross-temporal reconciliation of natural gas demands will focus on integrating demand forecasting with other energy systems, like electricity and heat generation, to optimize energy management. Investigating regulatory and policy implications within the hierarchical constraints set by government regulators is also a realistic application. Additionally, analyzing different customer segments and their consumption behaviors can lead to tailored forecasting approaches for each segment. These ideas contribute to a more efficient gas distribution system, while also providing significant financial incentives to local distribution companies.
The superior performance of MinT and VS methods over base forecasts is attributed to combining information from all levels of the natural gas demand hierarchy. Future directions include exploring error propagation based on individual gas demand levels rather than the entire hierarchy, as well as adjusting for the natural incoherence of gas data before reconciliation. In summary, MinT’s capability to leverage hierarchical structures and ensure coherence across aggregation levels renders it superior for hierarchical time-series forecasting and reconciliation in gas demand forecasting.