Reducing Computational Load for Mixed Integer Linear Programming: An Example for a District and an Island Energy System

Abstract: The complexity of Mixed-Integer Linear Programs (MILPs) increases with the number of nodes in energy system models. An increasing complexity constitutes a high computational load that can limit the scale of the energy system model. Hence, methods are sought to reduce this complexity. In this paper, we present a new 2-Level Approach to MILP energy system models that determines the system design through a combination of continuous and discrete decisions. On the first level, data reduction methods are used to determine the discrete design decisions in a simplified solution space. Those decisions are then fixed, and on the second level the full dataset is used to extract the exact scaling of the chosen technologies. The performance of the new 2-Level Approach is evaluated for a case study of an urban energy system with six buildings and an island system based on a high share of renewable energy technologies. The results of the studies show a high accuracy with respect to the total annual costs, chosen system structure, installed capacities and peak load with the 2-Level Approach compared to the results of a single level optimization. The computational load is thereby reduced by more than one order of magnitude.
Conflicts of Interest: The authors declare no conflict of interest.


Introduction
Urban districts are in search of strategies to reduce greenhouse gas (GHG) emissions. The strategies should be cost- and environmentally optimal solutions for future district energy supply systems. Microgrids have the potential to contribute to GHG reductions in urban districts because of their higher efficiency due to energy utilization synergies [1].
In the literature, microgrids are usually divided into microgrids with a connection to the macrogrid and island microgrids without a connection to a higher grid level [2][3][4]. A microgrid can include multiple residential and commercial buildings with specific electricity, heating and cooling demands. These are supplied by conventional technologies like boilers, distributed energy resources (DERs) such as photovoltaics (PV), energy storage technologies like batteries, and different energy distribution grids for electricity, natural gas and district heating [5][6][7]. The advantage of microgrids is the cost-optimal operation of supply technologies through exchanging energy between the buildings and storing it [8].

Research Objective
In this paper, we present a new approach to decreasing the complexity of MILP optimization models by splitting the optimization problem into two levels. The new 2-Level Approach is developed for MILP models for the design and operation of energy systems. On the first level, the structure of the energy system is determined with time series aggregation. The scaling of the chosen technologies and the operation optimization are conducted on the second level.
In Section 2, we present the applied methodology of the energy system optimization, as well as a detailed description of the new 2-Level Approach. In Sections 3 and 4, two case studies and their general input time series data, which is required for microgrid optimization, are presented. The first is a case study with six buildings in a spatially distributed network and the second is a single-node island system with a high share of renewable energy and limited fossil fuels-based electricity production. These case studies were chosen to investigate the impact of the proposed 2-Level Approach on both distributed energy systems with several identical units in different places and single-node energy systems with several rival, but different supply and storage technologies and a focus on the seasonal storage capacity usually required by highly renewable energy systems. In Section 5, the results of the case studies are analyzed and discussed.

Methodology
In the following sections, the optimization model (Section 2.1), the new 2-Level Approach (Section 2.2) and the scenario definition for both case studies (Section 2.3) are described.

Energy System Optimization Model Formulation
The energy system models are formulated as MILP, as presented in Welder et al. [34] with the FINE optimization framework [35]. Furthermore, we applied the time series aggregation module (TSAM) [36] for the new 2-Level Approach based on the FINE framework. The objective function of the optimization models is to minimize their total annualized cost (TAC). A detailed formulation of the energy system model can be found in Welder et al. [35].
The cost structure of the technology portfolio is represented by fixed and capacity-related CAPEX and OPEX costs. The fixed costs are modelled with a binary variable that determines if the technology is installed in the microgrid energy supply system or not.
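The role of the binary variable in this cost structure can be written compactly. The following is a minimal sketch rather than the exact FINE formulation; the symbols $b_c$ (installation decision), $x_c$ (installed capacity), $\bar{x}_c$ (maximum capacity), $c^{\mathrm{fix}}_c$ and $c^{\mathrm{cap}}_c$ (annualized fixed and capacity-related cost factors) are notation assumed here for illustration:

```latex
\[
\mathrm{TAC}_c = c^{\mathrm{fix}}_c \, b_c + c^{\mathrm{cap}}_c \, x_c ,
\qquad 0 \le x_c \le \bar{x}_c \, b_c ,
\qquad b_c \in \{0, 1\}
\]
```

With $b_c = 0$, the capacity is forced to zero and no cost arises; with $b_c = 1$, the fixed cost is incurred and the capacity can be scaled continuously up to $\bar{x}_c$.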
In the first case study, every building in the microgrid is represented by a node. Hence, the spatial resolution depends on the number of investigated buildings. In the second case study, all technologies are located in one node and have a limited capacity for a supply based on fossil fuels. The temporal resolution is one hour and depicts one year with 8760 time steps.
As the mathematical solver, we used Gurobi 8.0 (first case study) and Gurobi 8.1 (second case study). Pyomo is used as the modeling language.

2-Level Optimization Approach
In the following, a new 2-Level Approach is presented to reduce the complexity of optimization problems and increase the accuracy of the results compared to time series aggregation alone.
A given optimization problem is divided into two levels that are consecutively solved with the FINE model, as shown in Figure 1. The 2-Level Approach is not tied to FINE and could also be applied with other optimization frameworks.
On the first level, the optimization specifications are defined with time series data such as weather conditions, as well as the load profiles and techno-economic parameters of the considered technologies to build the microgrid energy system model in FINE. Furthermore, the time series aggregation (TSA) specifications with typical periods and the cluster method are specified. The storage operation between the typical periods is enabled by a superposition of system states [37]. The typical periods represent the full time series by clustering a set of similar periods around a set of typical periods (typical days). In this study, we use different numbers of typical days to represent the full time series. A low number of typical days (e.g., 5 typical days) typically leads to better computing performance compared to a high number (e.g., 40 typical days). On the other hand, the accuracy of the optimization results decreases with a decreasing number of typical days [28].
Different clustering methods such as k-means, k-medoids or hierarchical aggregation are common to cluster time series. For the first level, a hierarchical clustering algorithm [38] is used, as it is easily reproducible and maintains a higher variance of the input time series by representing the clusters with their medoid.
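The clustering step can be illustrated with a small, dependency-free sketch. It assumes Ward's merge criterion for the hierarchical clustering and uses squared Euclidean distance to pick each medoid; the four-hour "days" and all function names are hypothetical and only serve to show the mechanics, not the implementation used in TSAM.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two day profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members, days):
    """Mean profile of the days in one cluster."""
    n = len(members)
    return [sum(days[i][h] for i in members) / n for h in range(len(days[0]))]


def ward_clusters(days, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    merge causes the smallest increase in within-cluster variance (Ward)."""
    clusters = [[i] for i in range(len(days))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            na, nb = len(clusters[a]), len(clusters[b])
            cost = na * nb / (na + nb) * sq_dist(
                centroid(clusters[a], days), centroid(clusters[b], days)
            )
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a stays valid
    return clusters


def medoid(members, days):
    """Member day with the smallest summed distance to all other members."""
    return min(members, key=lambda i: sum(sq_dist(days[i], days[j]) for j in members))


# Hypothetical data: five "days" with four time steps each
days = [
    [1.0, 2.0, 2.0, 1.0],
    [1.1, 2.1, 1.9, 1.0],
    [5.0, 6.0, 6.0, 5.0],
    [5.2, 6.1, 5.9, 5.1],
    [0.9, 1.9, 2.0, 1.1],
]
clusters = ward_clusters(days, n_clusters=2)
# Each typical day is a real day (the medoid) plus the number of days it represents
typical_days = [(medoid(c, days), len(c)) for c in clusters]
print(typical_days)
```

Because the medoid is an actual day from the input data rather than an average, extreme values within a cluster are not smoothed away to the same extent as with centroid-based representations.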
With this input of optimization and TSA specifications, the first-level optimization problem is solved in the FINE model. The results represent the cost-optimal design and operation of the energy supply system for the aggregated time series. Of these results, only the technology structure of the microgrid energy supply system, i.e., the values of the binary variables, is passed on to the second level. On the second level, the installed capacities of the fixed technologies (technology scaling) and the optimal system operation are determined with the full time series, based on the set of binary variables (fixed technology structure) derived from the aggregated optimization on the first level. Hence, the MILP is reduced, with the given binaries, to a linear program. As a result, the 2-Level Approach yields an optimal design and operation of the energy supply system with higher accuracy than time series aggregation alone, because the second level is optimized with the full time series.
Furthermore, it should be mentioned that the proposed method could also be applied to components with piecewise linear cost functions, i.e., with more than one binary variable per component.
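The two levels can be sketched on a toy single-node heat supply problem. This is an illustration under strong assumptions, not the FINE implementation: the technology data, the demand values and all function names are hypothetical, the MILP is replaced by enumerating the possible structures, and the LP is replaced by an exact breakpoint search that only works for at most two units.

```python
from itertools import combinations

# Hypothetical data: fixed cost [EUR/a], capacity cost [EUR/(kW a)], variable cost [EUR/kWh]
TECHS = {
    "boiler": {"fix": 50.0, "cap": 5.0, "var": 0.08},
    "chp": {"fix": 200.0, "cap": 30.0, "var": 0.03},
}


def all_structures():
    """All non-empty combinations of technologies (the binary decisions)."""
    names = list(TECHS)
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


def operating_cost(caps, demand, weights):
    """Merit-order dispatch: units with the lowest variable cost run first."""
    order = sorted(caps, key=lambda t: TECHS[t]["var"])
    cost = 0.0
    for d, w in zip(demand, weights):
        rest = d
        for t in order:
            used = min(rest, caps[t])
            cost += w * used * TECHS[t]["var"]
            rest -= used
    return cost


def solve_scaling(structure, demand, weights):
    """Continuous sizing for a fixed structure (stands in for the LP).
    The cost is piecewise linear in the base-unit capacity, so checking
    the breakpoints (0 and the demand values) finds the optimum."""
    peak = max(demand)
    base = min(structure, key=lambda t: TECHS[t]["var"])
    candidates = [peak] if len(structure) == 1 else sorted({0.0, *demand})
    best = None
    for x in candidates:
        caps = {base: x}
        for t in structure:
            if t != base:
                caps[t] = peak - x  # the second unit covers the remaining peak
        tac = sum(TECHS[t]["fix"] + TECHS[t]["cap"] * caps[t] for t in structure)
        tac += operating_cost(caps, demand, weights)
        if best is None or tac < best[0]:
            best = (tac, caps)
    return best


# Full series: 8 load levels, each standing for 1095 h; aggregated: 2 typical levels
full_demand = [2.0, 3.0, 8.0, 10.0, 3.0, 2.0, 9.0, 4.0]
full_weights = [1095.0] * 8
agg_demand, agg_weights = [3.0, 9.0], [5 * 1095.0, 3 * 1095.0]

# Level 1: choose the structure (binaries) on the aggregated data
structure = min(all_structures(), key=lambda s: solve_scaling(s, agg_demand, agg_weights)[0])
# Level 2: keep the structure fixed and size the units on the full data
tac_two_level, caps = solve_scaling(structure, full_demand, full_weights)
# Reference: structure and sizing both determined on the full data
tac_ref = min(solve_scaling(s, full_demand, full_weights)[0] for s in all_structures())
print(structure, caps, tac_two_level, tac_ref)
```

In this sketch the aggregated data underestimates the peak load (9 kW instead of 10 kW), and the second level corrects the capacities while keeping the structure from the first level. Since the second-level solution is feasible for the full problem, its TAC can never fall below the reference TAC.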

Scenario Definition
To validate the methodology, the design and operation of the energy system is first determined by the new 2-Level Approach, and then by a single problem with an aggregated time series. These are compared to the reference case, which is computed with the full time series without any aggregation technique. For the 2-Level Approach and the time series aggregation, typical periods consisting of 5, 10, 20 and 40 typical days are considered in hourly resolution (see Table 1). The scenarios of both case studies are identical in order to maintain consistency and enable a comparison between small single-node models with a high relevance of seasonal storage and more complex and distributed energy systems. This means that on the first level of the approach, the input data is sorted into daily duration curves and aggregated with Ward's [39] hierarchical clustering algorithm for 5, 10, 20 and 40 typical days (i.e., clusters). The energy systems of both case studies are individually optimized for each of the three defined scenarios, i.e., the 2-Level Approach, the time series aggregation and the reference case. The resulting TAC, computing time and installed capacities are used to compare the scenarios.
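The sorting of the input data into daily duration curves can be sketched as follows. This is a minimal illustration that assumes the usual convention of sorting each day's values in descending order; the function name and the sample series are hypothetical.

```python
def daily_duration_curves(hourly, hours_per_day=24):
    """Split an hourly series into days and sort each day in descending order."""
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a multiple of hours_per_day")
    return [
        sorted(hourly[start:start + hours_per_day], reverse=True)
        for start in range(0, len(hourly), hours_per_day)
    ]


# Hypothetical two-day series with 4 "hours" per day for brevity
series = [3, 1, 4, 2, 0, 5, 5, 1]
curves = daily_duration_curves(series, hours_per_day=4)
print(curves)  # [[4, 3, 2, 1], [5, 5, 1, 0]]
```

Sorting removes the time of day at which a value occurs, so two days with the same load levels at different hours are treated as similar by the subsequent clustering.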

First Case Study-Urban Energy System
An exemplary district is investigated with six multi-family houses and six households in every individual building. The district is located in Germany. The buildings have similar building parameters and are connected by an electricity grid and a natural gas network. In Section 3.1, we will give an overview of the underlying data, such as the total energy demand of the considered district. Furthermore, the technology portfolio is presented in Section 3.2. Finally, in Section 3.3, the results of the first case study are investigated.
The technology-specific CAPEX and OPEX costs are composed of heat generation technologies, DER, storages, the purchase of electricity and natural gas minus the revenues of feed-in by PV and CHP generation. Figure 2 provides an overview of several components that contribute to the total annual costs of the microgrid energy supply system.

Data Basis
In the following section, the preprocessing of the time series data for the microgrid optimization model is shown. For the simulation of the heat load and PV generation profiles, the weather data of the test reference year (TRY) by the German Weather Service (DWD) for climate region 5 (Lower Rhine Westphalian Bay and Emsland) are used. The weather data is based on the period from 1988 to 2007. Different TRY data are available for mean, extreme cold and warm weather years. This work considers the weather year that corresponds to the mean temperature.
A stochastic bottom-up simulation model implemented in Python is used to determine the domestic electrical demand. The model, which is based on the CREST Demand Model [40][41][42], simulates the domestic electrical demand depending on the number of residents, the number of apartments and the individual user behavior of the inhabitants. In addition to light bulbs, LED and halogen lamps were integrated into the model on the basis of statistical data for Germany [43]. The user behavior of the inhabitants is determined by a probability distribution of the devices used in the households. The temporally resolved use of the devices is modelled by transition probabilities that are determined by first-order Markov chains.
The domestic heating demand of the buildings is determined with a 5R1C model that was implemented in Python as per Schütz et al. [21] and is based on EN ISO 13790. The demand is simulated for a building depending on the outside temperature, solar radiation, wind and building parameters by optimizing the living space temperature between two limit temperatures (21 °C and 24 °C). It is assumed that the domestic heating is only used during the typical North European heating period between 1 October and 30 April. The energy requirement for the supply of hot water is not considered.
The PV generation potential is calculated site-specifically for the district with PVLIB Python [44]. For the calculation of the PV generation profiles, it was assumed that all PV systems consist of modules of the type Hanwha HSL60P6-PA-4-250T, with a nominal power of 250 W. Each module is connected to an ABB MICRO-0.25-I-US inverter. The maximum installable capacity is determined depending on the available roof area. In Germany, roofs are often designed as saddle roofs, which means that their entire surface is usually distributed across two areas. The optimal installed PV capacity for a microgrid energy supply system is chosen by the microgrid optimization and is not part of the preprocessing. The load profiles and demand characteristics are presented in the appendix.
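The use of a first-order Markov chain for device switching can be illustrated with a short sketch. It is not the CREST model itself: the transition probabilities, the two-state on/off representation and the rated power of 100 W are assumptions chosen purely for illustration.

```python
import random

# Hypothetical transition probabilities P[state][next_state]; 0 = off, 1 = on
P = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}


def simulate_device(n_steps, start=0, seed=42):
    """Simulate on/off states where the next state depends only on the current one."""
    rng = random.Random(seed)
    state, states = start, []
    for _ in range(n_steps):
        states.append(state)
        state = 1 if rng.random() < P[state][1] else 0
    return states


profile = simulate_device(24)
load_w = [100 * s for s in profile]  # assumed rated power of 100 W
print(profile)
```

Summing such profiles over all devices and households would give an aggregated electrical load curve; in the model described above, the probabilities additionally depend on the residents' activity.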

Technology Portfolio
The district energy demand as an input parameter of the microgrid optimization is divided into the domestic electricity demand and the heating demand. According to this, the microgrid energy supply system has a technology portfolio for the electricity and heat supply. The system is divided into heat generation, electricity generation, storage technologies and grids. Boilers, CHPs and heat pumps are options for fulfilling the heat demand of the buildings in the microgrid optimization. The electricity demand can be satisfied by CHPs and PV as DER, as well as by the purchase of electricity over the macrogrid. Storage technologies in the microgrid optimization include batteries and heat storage. The buildings in the microgrid optimization are connected with an electricity distribution grid and a natural gas network. The electricity grid allows an exchange of locally produced electricity across the buildings. The natural gas network is required to supply boilers and CHPs with natural gas.
The cost structure of the considered technology portfolio is based on current (year 2017) technologies and energy prices, as well as the regulatory framework for grid feed-in of electricity in Germany (see Tables 2 and 3). The grid feed-in is based on the regulatory framework of the German Renewable Energies Act (EEG) for PV and the Combined Heat and Power Act (KWKG) for CHP. The costs of the electricity distribution grid and the natural gas network are not part of the optimization, as it is assumed that they are already in place.

Results of the First Case Study
The results of the microgrid optimization for the reference case show a microgrid energy supply system with electricity supply technologies, PV and CHP, as well as the heat supply technologies, boiler and CHP. The TAC of the considered microgrid energy supply system are 126,475 €/a.
The installed capacities of the technologies are based on the microgrid optimization results for the reference case and are shown in Figure 3. A CHP is only installed in building 1 (bd1). The district's largest heat storage is also built in building 1 to optimize the operation of the CHP. Building 1 has the second-highest heating and electricity demand of all buildings in the district, as well as the highest electricity peak load of the individual buildings. This suggests a correlation between the electricity peak load and the location of the CHP installation.
In buildings 2, 3, 4, 5 and 6 (bd2-bd6), the technology system design for the heat supply is similar. A natural gas boiler combined with a heat storage unit is chosen. A PV installation can be found in buildings 1, 2, 3 and 6. The reason for the missing PV in buildings 4 and 5 is the unfavorable roof orientation.
In Sections 3.3.1-3.3.4, the microgrid optimization will be executed with the time series aggregation approach, as well as the new 2-Level Approach, to compare the results with those from the reference case.

Investigation of Total Annual Costs
The TAC in the 2-Level Approach deviates by less than 0.1% from the TAC of the reference case, as shown in Figure 4. For the time series aggregation, the deviation is higher than for the 2-Level Approach for all analyzed numbers of typical days. With an increasing number of typical days, the deviation of the time series aggregation from the reference case decreases, but remains higher than in the 2-Level Approach. Overall, the 2-Level Approach shows a significantly lower deviation than the time series aggregation. For all typical periods, the 2-Level Approach tends to slightly overestimate the TAC, whereas the time series aggregation tends to underestimate it. The reason for the underestimation of the TAC in the time series aggregation approach is the averaging effect caused by the clustering of time series data.
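To put the deviation into absolute terms, the following worked example assumes that the deviation is defined relative to the reference TAC, which the wording above suggests but does not state explicitly. The symbol $\Delta_{\mathrm{TAC}}$ is notation introduced here.

```latex
\[
\Delta_{\mathrm{TAC}} = \frac{\mathrm{TAC} - \mathrm{TAC}_{\mathrm{ref}}}{\mathrm{TAC}_{\mathrm{ref}}},
\qquad
|\Delta_{\mathrm{TAC}}| < 0.1\,\%
\;\Rightarrow\;
|\mathrm{TAC} - \mathrm{TAC}_{\mathrm{ref}}| < 0.001 \cdot 126{,}475\ \text{€/a} \approx 126.5\ \text{€/a}
\]
```

Under this assumption, the 2-Level Approach differs from the reference case by less than roughly 127 €/a.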

Investigation of Computing Time
The computing time was measured on a Windows desktop PC with an Intel Core i7-6700K @ 4.00 GHz and 32 GB of RAM. The computing time of the first level in the 2-Level Approach is equal to that of the time series aggregation because it follows the same approach. The computing time is 51 s for 5 typical days, 315 s for 10 typical days, 2017 s for 20 typical days and 14,754 s for 40 typical days. In addition to the time series aggregation on the first level, the 2-Level Approach runs a full time series optimization on the second level, which results in an overall higher computing time for the 2-Level Approach than for the time series aggregation. The second level of the 2-Level Approach is an LP problem. Thus, its computing time changes only slightly, with a minimum of 556 s and a maximum of 597 s.
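The total computing time of the 2-Level Approach follows from adding both levels. Since the text does not state which second-level time belongs to which number of typical days, the sketch below only derives lower and upper bounds from the reported values; the variable names are illustrative.

```python
# Values as reported in the text (seconds)
level1_s = {5: 51, 10: 315, 20: 2017, 40: 14754}
level2_min_s, level2_max_s = 556, 597

# Bounds on the total time: first level plus fastest/slowest second level
total_bounds = {
    n: (t + level2_min_s, t + level2_max_s) for n, t in level1_s.items()
}
for n, (lo, hi) in total_bounds.items():
    print(f"{n:>2} typical days: {lo} s to {hi} s")
```

For 5 typical days the second level dominates the total time, whereas for 40 typical days the first-level MILP accounts for most of it.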

Investigation of the Supply Technologies
The microgrid optimization with full-time series (reference case), time series aggregation and the 2-Level Approach show the same technology structure. To meet the electricity and heating demand, the microgrid optimization opts for boilers, CHPs, heat storages and PV as supply technologies. The investigation of all installed supply technologies in the district is considered in the following paragraphs.
Boiler: The mean deviation of the 2-Level Approach related to the reference case is low, with a maximum of −4.46 kWth in 40 typical days and a minimum of −0.12 kWth in 5 typical days. In the time series aggregation, the mean deviation decreases from −51.8 kWth in 5 typical days to −17.57 kWth in 40 typical days. Thus, the mean results show a significantly lower deviation in the 2-Level Approach than in the time series aggregation. Figure 6 shows a positive deviation of installed boiler capacities in the case of 5, 10, 20 and 40 typical days for building 1 (bd1) in the 2-Level Approach. The reason for this pattern is that the CHP in the reference case is installed in building 1, as shown in Figure 3, but in the 2-Level Approach, the CHP is installed in other buildings. A negative deviation of installed boiler capacities is observed in the buildings with additional CHP installation, as shown in Figure 6. Hence, there is a boiler capacity shift based on changing the CHP location. For example, in the 5 typical days case, the boiler capacities decrease in building 3, because the CHP is installed there and, on the other hand, the installed capacities increase in building 1 because of the missing CHP installation in comparison to the reference case.
Moreover, we can observe that the time series aggregation shows a similar pattern to that in the 2-Level Approach with respect to the CHP-based capacity shifting. The installed capacities increase in building 1 and decrease in the building with the new CHP location. Additionally, there is a trend of high underestimation of boiler capacities in the buildings that are not part of the changing CHP location. The underestimation of installed capacities decreases with an increasing number of typical days because of more detailed time series data.
CHP: As discussed in the previous paragraph, the CHP installation location changes between the reference case and the 2-Level Approach, as well as the time series aggregation (see Figure 7). In the case of 5 typical days, the installed CHP shifts from building 1 to building 4, while in the case of 10 and 20 typical days, from building 1 to building 6, and in the case of 40 typical days, from building 1 to building 3.
The mean deviation of the installed CHP capacities computed with the 2-Level Approach is 0.02 kWth for all typical days. The optimization with the time series aggregation results in a minimum deviation of −0.02 kWth for 5 typical days and a maximum of 0.39 kWth for 40 typical days. A possible reason why the CHP installation shifts between buildings for the different typical days is that the resulting difference in the TAC is not significant. Hence, the CHP location has no significant effect on the TAC. A further analysis is performed in Section 3.3.5 in order to investigate the TAC in relation to the location of the CHP.
Heat Storage: Comparing the heat storage deviations to the reference case shows that, for the 2-Level Approach, the mean error of installed heat storage capacities changes with the increasing number of typical days, from 5.59 kWh to 21.44 kWh. The time series aggregation results exhibit the maximum mean deviation of −32.98 kWh for 5 typical days and the lowest mean deviation of 0.57 kWh for 10 typical days.
As described in the analysis of boilers and CHPs, the heat storage results also show a clear pattern of capacity shifting between the buildings on the different typical days, especially in the 2-Level Approach, as shown in Figure 8. In the case of 5 and 10 typical days, the installed heat storage capacities decrease in building 1 and increase in building 6, while in the case of 10 and 20 typical days, the installed capacities shift from building 1 to building 6. The possible reason for the same pattern of capacity shifting as in the CHP investigation is that the heat storage supports the operation of the CHP. The electricity generation costs with the CHP are, at 0.26 €/kWh, cheaper than purchasing electricity for 0.2985 €/kWh from the energy provider. Thus, the CHP is preferred by the optimizer to generate electricity, but the thermal energy of the CHP must be used to fulfil the heating demand or store heat in the storage, because an external chiller is not available for the CHP. Hence, the heat storage follows the CHP to buffer excess thermal energy.
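The cost advantage of CHP electricity over grid purchase follows directly from the two prices quoted above; the symbols $c_{\mathrm{grid}}$ and $c_{\mathrm{CHP}}$ are introduced here only for this calculation:

```latex
\[
c_{\mathrm{grid}} - c_{\mathrm{CHP}} = 0.2985 - 0.26 = 0.0385\ \text{€/kWh},
\qquad
\frac{0.0385}{0.2985} \approx 12.9\,\%
\]
```

Each kilowatt-hour generated by the CHP instead of being purchased thus saves about 3.9 ct, provided the co-generated heat can be used or stored.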
The high deviation of heat storage capacities for 40 typical days, compared to the other numbers of typical days, results from the underestimation of boiler capacities in building 3 for 40 typical days, which does not influence the TAC of the energy supply system.
PV: Both the 2-Level Approach and the time series aggregation match the installed PV capacities of the reference case for all investigated numbers of typical days. Thus, both approaches reproduce the installed PV capacities with no deviation from the reference case.

Investigation of Peak Load
The peak load is the maximum electrical load that flows over the grid assets. In general, a transformer in a microgrid optimization is the connection between a district and a higher grid level (e.g., between a low voltage and a medium voltage grid). The load flow is divided into two directions: in the top-down load flow, electricity from the higher grid level flows via the transformer to fulfil the electricity demand in the district, whereas the bottom-up load flow is defined as the flow from the district (lower level) to the higher grid level via the transformer. Bottom-up load flow occurs when the locally produced electricity (e.g., from PV or CHP) exceeds the local electricity demand in the district. The peak load is essential for analyzing the stress on the transformer. Voltage issues may also become relevant in real grids but are out of scope for this investigation. The peak load investigation is applied only to the electricity grid and not to the natural gas network.
The bottom-up peak load and the top-down peak load of the reference case are reproduced very well by the 2-Level Approach, as shown in Figure 9. The time series aggregation underrepresents the peak load for the different numbers of typical days. Only for the bottom-up peak load with 20 typical days is the result of the time series aggregation close to the reference case. This result is probably a coincidence, as it is worse with 40 typical days.

The reason for the underrepresentation of the bottom-up load and top-down load in the time series aggregation is a missing data problem caused by the clustering of the time series. Increasing the number of typical days should tend to represent peak periods more accurately with the time series aggregation, because the clusters become smaller. It is also possible to add peak periods to the time series aggregation, which leads to a more accurate solution. On the other hand, adding peak periods increases the optimization complexity and the computing time because of the additional time periods. Hence, the investigation shows that the 2-Level Approach is well suited to represent the bottom-up peak load and top-down peak load, while the time series aggregation represents them insufficiently.
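The effect described above can be illustrated with a minimal sketch. It assumes a signed net transformer flow (positive values for top-down flow, negative values for bottom-up flow), which is a convention chosen here and not taken from the model; the function name and all numbers are hypothetical. It shows that representative days selected from the data can only reproduce the peaks if the day containing the extreme event is among them.

```python
def peak_loads(net_flow_kw):
    """Return (top_down_peak, bottom_up_peak) of a signed transformer flow.

    Assumed sign convention: positive = import from the higher grid level
    (top-down), negative = export to the higher grid level (bottom-up).
    """
    top_down = max((p for p in net_flow_kw if p > 0), default=0.0)
    bottom_up = max((-p for p in net_flow_kw if p < 0), default=0.0)
    return top_down, bottom_up


# Hypothetical net transformer flow for four days with three time steps each.
days = [
    [20.0, 35.0, -5.0],
    [25.0, 30.0, -10.0],
    [22.0, 80.0, -40.0],  # the day that contains both extreme events
    [24.0, 33.0, -8.0],
]

full_series = [p for day in days for p in day]
# Two representative days taken from the data; the extreme day is not selected.
typical_series = days[0] + days[1]

full_peaks = peak_loads(full_series)
typical_peaks = peak_loads(typical_series)
print(full_peaks, typical_peaks)
```

Because the typical days are a subset of the original days, their peaks can never exceed those of the full series, which is why adding peak periods explicitly is one way to close the gap.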

Impact Analysis of Fixed CHP Position
The investigation of CHPs in Section 3.3.3 showed that only one CHP was installed in one building of the analyzed microgrid energy supply system, but that the building in which it is installed changes with the number of typical days. In the reference case, the CHP was installed in building 1. In the 2-Level Approach, the CHP was installed in building 4 for 5 typical days, in building 6 for 10 and 20 typical days, and in building 3 for 40 typical days. To investigate the impact of this changing CHP location, an analysis was performed in which the building with the installed CHP was fixed, one building after another, and the deviation of the TAC from the reference case was determined. The results of the analysis are shown in Table 4. The table lists the location of the fixed CHP installation in its columns and the deviation of the TAC from the reference case for 5, 10, 20 and 40 typical days.
Table 4. Impact of different fixed CHP locations on the TAC.
The analysis shows that the deviation of the TAC for the different fixed CHP locations is low, ranging from 0.013% to 0.151% relative to the reference case. The lowest deviation of 0.013% is reached if the CHP is fixed in building 1, which corresponds to the location in the reference case. The highest deviation of the TAC, 0.151%, occurs for a fixed CHP location in building 2 with 10 typical days. In general, however, all results with a fixed CHP location show low deviations compared to the reference case. Therefore, the impact of the different placements of the CHP on the TAC is low.
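For orientation, the percentage deviations quoted above can be read as a relative difference to the reference case. The expression below is a sketch under that assumption; the symbols are notation introduced here and do not appear in the original, with $b$ denoting the building in which the CHP is fixed and $n$ the number of typical days.

```latex
\Delta \mathrm{TAC}_{b,n}
  = \frac{\mathrm{TAC}^{\mathrm{fix}}_{b,n} - \mathrm{TAC}^{\mathrm{ref}}}
         {\mathrm{TAC}^{\mathrm{ref}}} \cdot 100\,\%
```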
A pattern of capacity shifting also appears for the different fixed CHP locations. The pattern was seen for 5, 10, 20 and 40 typical days. The arrows in Figure 10 represent the trend of installed capacity shifting for the technologies of boilers and heat storage. A dash indicates no change in installed capacities in comparison to the reference case.
If the CHP is installed in building 1, which is its location in the reference case, the installed capacities of boilers and heat storage do not change. In the other cases of a fixed CHP location, there is a clear pattern: boiler capacities decrease and heat storage capacities increase in the building of the fixed CHP installation. Furthermore, in building 1, the boiler capacities increase and the heat storage capacities decrease. The reason for the increasing boiler capacities in building 1 is that the thermal capacity of the CHP is missing there.

Second Case Study-Island System
The second study applies the presented 2-Level Approach to a simple single-node island system with two commodities, namely electricity and hydrogen. The island system consists of a wind farm, photovoltaics and a small backup plant as the energy supply, and a single electricity demand. Moreover, two alternative energy storage technologies are included: the first stores electricity directly in batteries, while the second uses electrolyzers to convert surplus electricity from the electrical grid into hydrogen, which is stored in hydrogen pressure vessels [60] and reconverted in fuel cells.

Data Basis
The regional electricity demand is the ENTSO-E profile of Germany in 2013, normalized to a peak load of 1 MW; like the wind feed-in, it is drawn from Robinius et al. [61]. This serves only as an illustration, since real grids of about 1 MW do not have industrial base loads in their profile. The potential photovoltaic feed-in is simulated with PV-Lib [62]. The backup power plant's total feed-in is restricted to 10% of the total energy demand in order to maintain a high share of renewable energy supply. As has been shown by other authors [13], the CO2 reduction costs grow exponentially with decreasing CO2 emissions. A maximum total feed-in of 10% by the power plant turned out to be an appropriate percentage to avoid unrealistic surplus capacities of the renewable energy and storage components and to make the hydrogen storage a competitive solution in this system. Moreover, this highlights the influence of the presented method on the storage dimensioning if an insufficient number of typical days is chosen in the first level. The system is shown in Figure 11. The modeling of seasonal storage as proposed by Kotzur et al. [37] is also taken into account, which is especially important for the hydrogen storage.
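The two data preparation steps named above, normalizing the demand to a peak load of 1 MW and limiting the backup plant to 10% of the total energy demand, can be sketched as follows. The function names and all numbers are hypothetical, and equally long time steps are assumed so that sums of power values are proportional to energy.

```python
def normalize_to_peak(profile, peak=1.0):
    """Scale a load profile so that its maximum equals `peak` (e.g. 1 MW)."""
    raw_peak = max(profile)
    return [peak * value / raw_peak for value in profile]


def backup_share(backup_feed_in, demand):
    """Share of the total energy demand that is covered by the backup plant."""
    return sum(backup_feed_in) / sum(demand)


raw_demand = [40.0, 55.0, 80.0, 60.0]       # hypothetical profile, arbitrary unit
demand_mw = normalize_to_peak(raw_demand)   # peak is now 1 MW
backup_mw = [0.0, 0.05, 0.2, 0.0]           # hypothetical backup dispatch in MW

share = backup_share(backup_mw, demand_mw)
within_limit = share <= 0.10                # the 10% restriction from the text
print(round(share, 4), within_limit)
```

In the optimization itself, this restriction would be a constraint on the dispatch variables rather than a check after the fact; the sketch only shows what the constraint measures.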

Technology Portfolio
Figure 11. Technology portfolio of the island system.
The detailed input parameters used for modeling the island system are listed in Table 5 and are derived from Kotzur et al. [37], with an interest rate of 4% per year for each component for consistency with the case study presented above. It is worth mentioning that only the wind farm, the photovoltaics, the electrolyzer and the fuel cell are modeled with binary variables, which account for a certain starting investment (named CAPEXFix in the tables) and fixed operation costs (OPEXFix) depending on whether these units are chosen (1) or not (0). In contrast, the backup plant, the battery and the hydrogen storage are modeled linearly, since their overall costs depend only on their consumed commodity (gas at 20 ct/kWh in the case of the backup plant) and their capacities, respectively. This fairly simple layout is chosen to emphasize the big impact of the proposed 2-Level Approach on the computing time even for small systems while maintaining good results. Additionally, a single-node model with a high share of renewable energy was chosen to highlight the impact of the proposed method on seasonal storage while neglecting the balancing effects of multi-regional distribution grid modeling.
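The difference between the components with binary variables and the linearly modeled components can be written as a generic fixed-charge formulation. This is a sketch and not the exact formulation of the model: the capacity-specific terms $\mathrm{CAPEX}^{\mathrm{Var}}_i$ and $\mathrm{OPEX}^{\mathrm{Var}}_i$ and the upper bound $M_i$ are names assumed here, and the annualization with the interest rate is omitted.

```latex
\begin{aligned}
C_i &= b_i \left( \mathrm{CAPEX}^{\mathrm{Fix}}_i + \mathrm{OPEX}^{\mathrm{Fix}}_i \right)
     + \mathrm{cap}_i \left( \mathrm{CAPEX}^{\mathrm{Var}}_i + \mathrm{OPEX}^{\mathrm{Var}}_i \right), \\
0 &\le \mathrm{cap}_i \le b_i \, M_i, \qquad b_i \in \{0, 1\}
\end{aligned}
```

For a linearly modeled component, the binary variable $b_i$ and the fixed terms drop out, so that the cost depends only on the capacity and, for the backup plant, on the consumed commodity.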

Results of the Second Case Study
The following sections focus on the optimization results of the island system presented above. They first investigate the total annual costs as the actual objective function of the optimization, then the computing times to highlight the benefits of the 2-Level Approach, followed by a detailed analysis of the components' capacities within the different levels of the proposed approach.

Investigation of Total Annual Costs
Figure 12 illustrates the deviation of the total annual costs, depending on the number of typical days and on the level of the optimization at which they are determined, in comparison to the reference case, which uses the full time series and all four binaries as variables.

As mentioned above, an optimization based on 5 or 10 typical days, obtained by hierarchical clustering of the daily duration curves, neglects the hydrogen technologies, which ultimately results in higher total annual costs for the whole system after the second optimization. This can be attributed to the fact that the clustering algorithm has a big impact on the smoothness and variance of the clustered input data.
In the first level of the optimization using 5 or 10 typical days, this leads to an underestimation of the total variance of the input time series: the time series is considered to be highly repetitive and the overall variance is not preserved, which has an impact on the design of surplus capacities. Since the fuel cell and the electrolyzer each require a starting investment of 100,000 € before any capacity is built, the optimization based on the aggregated time series turns out to be more profitable if the hydrogen storage technology is not chosen. Because the binary variables from the first level of optimization are used as input parameters for the second level, the hydrogen technologies are also not implemented when the optimization is repeated with the full time series. This turns out to be an inadequate choice, since the overall costs of the energy system with batteries as the only storage technology exceed the total annual costs of the reference case by approximately 7%. In contrast, 20 typical periods seem to be sufficient to find the right combination of binary variables, which means that all technologies with binary variables, namely wind, PV and the hydrogen technologies, are chosen as in the reference case. This reduces the deviation of the total annual costs compared to the reference case to zero after the second level of optimization. The same can be observed for 40 typical days. However, it needs to be highlighted that in this case the deviation of the total annual costs after the first level of optimization is already far below 1%, which raises the question of whether the second level of optimization is necessary at all in this case.
Finally, choosing the number of clusters needed to find a good set of binary variables is a highly non-trivial question, and it is not guaranteed that the set of binary variables found in the first level always remains the same when more than 20 typical days are chosen. However, finding the cost-optimal set of binary variables becomes more likely as the number of typical days increases.
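The interplay of the two levels discussed in this section can be illustrated with a deliberately small toy problem. It is not the island system model and does not use a MILP solver: one technology with a fixed starting investment competes with purchased energy, the binary decision is taken on aggregated data, and the capacity is then sized on the full data with the binary fixed. All names, cost parameters and time series are hypothetical.

```python
def total_cost(build, capacity, demand, weights,
               fixed=50.0, per_kw=2.5, price=1.0):
    """Fixed-charge investment cost plus cost of the purchased residual energy."""
    if not build:
        capacity = 0.0
    purchased = sum(w * max(d - capacity, 0.0) for d, w in zip(demand, weights))
    return build * fixed + per_kw * capacity + price * purchased


def best_capacity(build, demand, weights):
    """The cost is piecewise linear in the capacity, so its minimum lies on a
    breakpoint; checking all demand values (and zero) is therefore exact."""
    candidates = ([0.0] + sorted(set(demand))) if build else [0.0]
    return min(candidates, key=lambda c: total_cost(build, c, demand, weights))


def two_level(agg_demand, agg_weights, full_demand):
    # Level 1: take the discrete decision on the aggregated data.
    build = min((0, 1), key=lambda b: total_cost(
        b, best_capacity(b, agg_demand, agg_weights), agg_demand, agg_weights))
    # Level 2: keep the binary fixed and size the capacity on the full data.
    ones = [1.0] * len(full_demand)
    capacity = best_capacity(build, full_demand, ones)
    return build, capacity, total_cost(build, capacity, full_demand, ones)


full = [8.0, 12.0, 10.0, 10.0, 28.0, 32.0, 30.0, 30.0, 26.0, 34.0, 30.0, 30.0]
aggregated, weights = [10.0, 30.0], [4.0, 8.0]  # two clusters with their sizes

result = two_level(aggregated, weights, full)
print(result)
```

In this toy case the first level selects the technology and the second level corrects the cost estimate using the peaks that the aggregated data smooths out. If the first level had rejected the technology, the second level could not bring it back, which corresponds to the behavior observed for 5 and 10 typical days.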

Investigation of Computing Time
The investigation of the total computing time is illustrated in Figure 13. Here, the computing time for the first level of optimization, labelled "Time Series Aggregation", and the total computing time for the 2-Level Approach, which includes the optimization time for the first and second level but excludes the calculation time for the clustering, which turned out to be negligibly small, are compared with the reference calculation time. For the optimization of the island system, an Intel Core i7 processor with eight cores and a clock rate of 3.04 GHz and 16 GB of RAM were used. Gurobi 8.1.0 was executed using 6 threads to solve the problem.
First of all, the computing time of the first level of the optimization rises monotonically with the number of typical days, which is to be expected because the number of variables grows when a long time series is represented by more and more representative time steps.
However, the overall computing time of the 2-Level Approach depends, in our case, more strongly on the second optimization, which becomes more demanding when the hydrogen technologies are included for 20 or 40 typical days, which in turn is crucial for achieving the cost-optimal solution. In these cases, the calculation time for the reference case, 271 s, is still 7.2 and 5.3 times longer than the calculation times for 20 typical days (38 s) and 40 typical days (51 s), respectively. However, comparing this result with the calculation time of the first level of optimization alone using 40 typical days, whose deviation of the total annual costs is already well below 1%, shows that the latter, at 23 s, outperforms all other methods by more than 35%. Therefore, it is highly important to define the targets of the optimization before using the proposed 2-Level Approach, since a simple clustering method with a sufficient number of clusters might be more adequate in some applications.
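The comparison above can be retraced from the quoted computing times. The following sketch uses only the rounded values given in the text, so the resulting factors may differ slightly from the reported ones, which were presumably computed from unrounded timings; the variable names are chosen here.

```python
reference_s = 271.0          # full time series, single-level optimization
two_level_s = {20: 38.0, 40: 51.0}   # 2-Level Approach, by number of typical days
first_level_40_s = 23.0      # first level alone with 40 typical days

# Speed-up of the 2-Level Approach relative to the reference case.
speedup = {days: reference_s / t for days, t in two_level_s.items()}

# Time saved by the first level alone compared to the fastest 2-Level result.
fastest_two_level = min(two_level_s.values())
saving = 1.0 - first_level_40_s / fastest_two_level

for days, factor in speedup.items():
    print(f"{days} typical days: {factor:.1f} times faster than the reference")
print(f"first level alone: {saving:.0%} less time than the fastest 2-Level run")
```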

Investigation of the Different Technology Capacities
Photovoltaic: The photovoltaic capacities depending on the number of typical days and the different levels are shown in Figure 14. Here, the dotted line represents the capacity of the reference case. As discussed above, representing the time series by 5 or 10 typical days leads to the neglect of the hydrogen technologies and simultaneously to an overestimation of the redundancy of the time series, which means that the clustered time series have too regular a pattern to come close to the real cost-optimal solution. For photovoltaics, this leads to an overestimation of the capacities, especially in the first level of the optimization. This effect is overcome when repeating the calculation with the full time series. However, a higher share of photovoltaics is still needed to compensate for the lack of hydrogen storage. For 20 or 40 typical days, the optimizations already approach the reference case in the first level, and the results are identical to it in the second level.
Wind energy: The wind energy capacities depending on the choice of the number of typical days and the different levels are shown in Figure 15.  As was discussed above, the representation of the time series by 5 or 10 typical days leads to the negligence of the hydrogen technologies and simultaneously to an overestimation of the redundancy of the time series which means that the clustered time series have a too regular pattern to come close to the real cost-optimal solution. For photovoltaics, this leads to an overestimation of the capacities, especially in the first level of the optimization. This effect is overcome when repeating the calculation with the full time series. However, a higher share of photovoltaics is still needed to compensate the lack of hydrogen storage. For 20 or 40 typical days both optimizations already approach the reference case in the first level and the results are identical for the second case.
Wind energy: The wind energy capacities depending on the choice of the number of typical days and the different levels are shown in Figure 15. days with a final deviation of the total annual cost with well below 1%, it becomes obvious that it outperforms all other methods, with 23 s, by more than 35%. Therefore, it is highly important to predefine the targets of the optimization before using the proposed 2-Level Approach, since a simple clustering method with a sufficient number of clusters might be more adequate in some applications.

Investigation of the Different Technology Capacities
Photovoltaic: The photovoltaic capacities depending on the choice of the number of typical days and the different levels are shown in Figure 14. Here, the dotted line represents the capacity of the reference case. As was discussed above, the representation of the time series by 5 or 10 typical days leads to the negligence of the hydrogen technologies and simultaneously to an overestimation of the redundancy of the time series which means that the clustered time series have a too regular pattern to come close to the real cost-optimal solution. For photovoltaics, this leads to an overestimation of the capacities, especially in the first level of the optimization. This effect is overcome when repeating the calculation with the full time series. However, a higher share of photovoltaics is still needed to compensate the lack of hydrogen storage. For 20 or 40 typical days both optimizations already approach the reference case in the first level and the results are identical for the second case.
Wind energy: The wind energy capacities as a function of the number of typical days and the optimization level are shown in Figure 15. In contrast to the photovoltaic capacities, the wind energy capacities are underestimated in the first level for 5 or 10 typical days. However, when the optimization is repeated with the full time series, the capacities are overestimated for 10 typical days and, in contrast to photovoltaics, wind energy is favored more strongly when the full time series is used. This is because wind profiles do not have a strong daily pattern and do not correlate strongly with photovoltaics unless both are clustered together into a small number of typical days. For 20 or 40 typical days, the 2-Level Approach meets the results of the reference case. This means that a higher number of clusters does not necessarily improve the clustering process itself if values that lack a daily pattern are first grouped on a daily basis.
Backup plant: The backup plant capacities as a function of the number of typical days and the optimization level are shown in Figure 16. Taking into account that the yearly amount of electricity produced by the backup plant is limited to 10% of the overall electricity production, it becomes clear that clustering with too few typical days leads to an underestimation of the capacities, because extreme periods are largely neglected and the backup plant operates on a more regular basis with a lower peak load. This is the case for 5 typical days. However, since medoids are chosen as representatives in the hierarchical clustering algorithm used here, this is not a strict rule, as can be observed for 10 and 20 typical days. In the second level of optimization, the missing hydrogen technologies for 5 and 10 typical days lead to a remarkably higher capacity of the backup plant, even in comparison to the reference case. Firstly, the full time series takes all extreme periods into account and, secondly, the residual loads that cannot be met by wind energy, PV and the battery are considerably higher.
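The difference between a medoid and an averaged representative can be illustrated with a minimal sketch. The three daily profiles below are hypothetical and are not taken from the case study; the helper names `centroid` and `medoid` are likewise illustrative. The sketch only shows why a medoid, being an actual day of the cluster, can retain a pronounced peak that an element-wise average smooths out.

```python
import math

def centroid(profiles):
    """Element-wise mean of all profiles in one cluster."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]

def medoid(profiles):
    """Cluster member with the smallest summed Euclidean distance to all members."""
    return min(profiles, key=lambda p: sum(math.dist(p, q) for q in profiles))

# Hypothetical cluster of three daily profiles (four time steps, arbitrary units).
cluster = [
    [1, 2, 9, 2],
    [1, 8, 2, 2],
    [1, 2, 2, 9],
]

mean_day = centroid(cluster)
medoid_day = medoid(cluster)

print("peak of centroid:", max(mean_day))
print("peak of medoid:", max(medoid_day))
print("highest peak in the cluster:", max(max(p) for p in cluster))
```

In this constructed example the medoid keeps a peak close to the highest value in the cluster, while the centroid peak is roughly half of it. Whether the medoid peak lies above or below the true system peak depends on which day happens to be selected, which is consistent with the observation that the underestimation is not a strict rule.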
Hydrogen technologies: The capacities of the hydrogen technologies are shown in Figures 17-19. Since they form an additional supply line that is used only for temporarily storing energy from the electricity grid, it is clear that the capacities of all three units correlate with each other. Since the electrolyzer and the fuel cell each have a fixed investment, in contrast to the battery, the hydrogen technologies are not built for 5 and 10 typical days due to the averaging effect of clustering, but are overestimated for 20 and 40 typical days due to the small price per capacity. However, the greater the number of typical days, the more the capacities converge to the reference case. Since the setup of the binary variables for 20 and 40 typical days is identical to the reference case, the 2-Level Approach leads to identical solutions in these cases.
Battery: Figure 20 shows the battery capacities of this study, which turn out to be smaller than the hydrogen capacities by one order of magnitude if the hydrogen technologies are chosen to be built. For 5 and 10 typical days used for the initial optimization, an overestimation of the battery capacities can be observed, since the hydrogen technologies are not chosen to store energy from the electricity grid. The reason for that can be derived by comparing the capacities in the first level to those in the second level: because extreme periods are neglected when few typical days are used, and because the regularity of inter-daily patterns is overestimated, smaller storage capacities seem to be sufficient compared to the optimizations that use the full time series. This rule also applies to the cases in which the hydrogen technologies are chosen to be built. In these cases, the capacities for the battery storage using typical days are smaller than in the reference case, whereas the reference capacities are met after the second level of optimization.

Investigation of the Connection between the Storages
Although the amount of stored energy in the hydrogen storage and the battery storage depends on the efficiency of the fuel cell and the discharge efficiency of the battery, it can be stated that the amount of energy stored in the hydrogen storage exceeds that of the battery by about one order of magnitude. However, it can be observed that both storages correlate negatively with respect to their deviations from the reference case within the first level of optimization: when the battery storage is underestimated, the hydrogen storage is overestimated, and vice versa. In addition, the battery storage works on a daily basis, while the hydrogen storage stores energy for longer periods of time, which is illustrated in the color plots in Appendix A (Figures A1 and A2).
Finally, it has to be mentioned that the optimization using the 2-Level Approach results in the same total annual cost minimum as the reference case but in a slightly different operation of the individual units. This is shown in Figure 21. As can be seen, the black line representing the reference case differs slightly from the output of the 2-Level Approach. However, this is a typical result for energy system optimizations with several alternative technologies, since the objective function is flat around its optimum and sometimes even indifferent between different solutions.

Summary and Conclusions
MILP microgrid optimization models have a high degree of complexity, and the requisite computing time therefore limits the solution of multi-node systems. To reduce the complexity of the models, techniques such as spatial or temporal aggregation can be used. Time series aggregation is a common approach to decrease the temporal resolution by clustering typical time periods. The drawbacks of time series aggregation are that the optimization results are underestimated and that the original time series is represented incompletely, e.g., the peak load is underestimated.
This paper investigates a new 2-Level Approach based on time series aggregation that decreases the optimization complexity compared to a full optimization and increases the accuracy compared to a design based only on aggregated time series. The investigation was performed, by way of example, for a MILP microgrid energy supply system with six multi-family houses, as well as for a hypothetical island system with a high share of renewable energy, which shows that a transfer of the approach to other energy systems is conceivable.
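The two levels can be sketched on a deliberately tiny toy problem. Everything in the following block is an assumption made for illustration: the cost parameters, the two hypothetical units A and B, the two-day demand series, and the use of a simple average day as a stand-in for the clustering. It is not the model used in the case studies; it only mirrors the sequence of first deciding the binary on aggregated data and then re-sizing the continuous capacity on the full data.

```python
# Hypothetical cost parameters (not taken from the case studies).
FIXED_COST_A = 100.0   # investment incurred only if unit A is built (binary decision)
COST_PER_KW_A = 10.0   # capacity-specific cost of unit A
COST_PER_KW_B = 20.0   # capacity-specific cost of unit B (no fixed part)

def total_cost(build_a, capacity):
    """Cost of covering the peak demand with either unit A or unit B."""
    if build_a:
        return FIXED_COST_A + COST_PER_KW_A * capacity
    return COST_PER_KW_B * capacity

def best_design(peak):
    """Enumerate the single binary and size the chosen unit to the given peak."""
    build_a = min((True, False), key=lambda b: total_cost(b, peak))
    return build_a, peak, total_cost(build_a, peak)

def typical_day(days):
    """Stand-in for clustering: average all days into one typical day."""
    return [sum(step) / len(days) for step in zip(*days)]

full_series = [[4, 8, 12, 6], [6, 10, 20, 8]]  # two hypothetical demand days

# Level 1: binary and capacity are optimized on the aggregated data.
level1_binary, level1_capacity, _ = best_design(max(typical_day(full_series)))

# Level 2: the binary is fixed, the capacity is re-sized on the full time series.
full_peak = max(max(day) for day in full_series)
level2_capacity = full_peak
level2_cost = total_cost(level1_binary, level2_capacity)

# Reference: binary and capacity are optimized on the full time series.
ref_binary, ref_capacity, ref_cost = best_design(full_peak)

print("level 1 capacity:", level1_capacity)
print("level 2 capacity:", level2_capacity, "cost:", level2_cost)
print("reference capacity:", ref_capacity, "cost:", ref_cost)
```

In this toy setting the first level already selects the same unit as the reference but undersizes it, because averaging flattens the peak; the second level recovers the reference capacity and cost. If the aggregated data had led to a different binary choice, the second level could not correct it, which corresponds to the behavior reported for the island system with few typical days.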
The results of the first case study show that the 2-Level Approach represents the installed technology capacities accurately, as well as the top-down and bottom-up peak load periods, compared to the reference case. The computing time is lower for the 2-Level Approach than for the reference case, with a maximum decrease of 99.72% for 5 typical days and a minimum decrease of 92.96% for 40 typical days. The TAC is represented very well by the 2-Level Approach, with a maximum deviation of 0.08% from the reference case. The location of the installed technologies changes between the buildings inside the district for different numbers of typical days in the 2-Level Approach, but this does not affect the TAC. Thus, a high sensitivity of the technology placement can be assumed. The installed capacities aggregated over all buildings in the district change only slightly in the 2-Level Approach, while the time series aggregation shows a significantly higher deviation, especially for the installed boiler capacities.
For the second case study, different results with respect to the impact of the binary variables can be observed, but the overall benefit of the 2-Level Approach in computing time remains. The island system is smaller and less complex and has only four binary variables. Yet, their influence is significant: for 5 and 10 typical days, the hydrogen technologies for storing energy from the electricity grid are neglected, which results in an overestimation of the TAC of approximately 7% compared to the reference case. However, the decrease in the calculation time is 98.82% and 97.75%, respectively. For 20 and 40 typical days, the set of binaries is identical to the one in the reference case, which reduces the deviation of the TAC to 0%. However, the decrease in computing time is reduced to 86.06% and 81.27%, respectively. This is why 20 typical days is the most appropriate number for optimizing this energy system (second case study) in an exact but computationally tractable manner. This case study also shows how strongly the result depends on the number of typical days chosen in the first level, which must be large enough to yield a suitable set of binary variables among rival storage technologies.
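To relate the reported percentage reductions to speed-up factors, a relative reduction r of the computing time can be converted into the ratio of reference time to reduced time, 1 / (1 − r). The short sketch below applies this conversion to the percentages reported in this section; the function name `speedup_factor` and the dictionary labels are illustrative only.

```python
def speedup_factor(reduction_percent):
    """Convert a relative computing-time reduction (in %) into t_reference / t_reduced."""
    remaining_share = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_share

# Reductions in computing time as reported in this section.
reported_reductions = {
    "district, 5 typical days": 99.72,
    "district, 40 typical days": 92.96,
    "island, 5 typical days": 98.82,
    "island, 10 typical days": 97.75,
    "island, 20 typical days": 86.06,
    "island, 40 typical days": 81.27,
}

for label, reduction in reported_reductions.items():
    print(f"{label}: about {speedup_factor(reduction):.1f} times faster")
```

Under this conversion, the district reductions correspond to factors of roughly 357 and 14, and the island reductions to factors of roughly 85, 44, 7 and 5.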
Finally, the comparison of both case studies illustrates that the sensitivity of the component selection (i.e., the set of binaries) depends on the number of components and their financial similarity: the more similar two components are in terms of their total annual costs and operational behavior, the less the TAC depends on either design. This means that a higher number of typical periods must be chosen to achieve the same energy system design as in the reference case.
In summary, the results of this study show a high degree of accuracy for the 2-Level Approach. Hence, it is possible to optimize large, multi-node energy systems for which optimization with the full time series would not be applicable. With the 2-Level Approach, an investigation of a detailed technology operation, including long-term storage, is achievable with high accuracy and acceptable computing time. Furthermore, the transformer load of a district, for example, can be analyzed for different scenarios, which is important for distribution grid planning. The question of a sufficient time series aggregation for generating a good initial solution remains a task for future research, as does the application of the proposed method to MILPs that contain piecewise linear functions with more than one binary variable per component.

Funding: The first author would like to thank Westnetz GmbH for their cooperation and funding. Additionally, the Helmholtz Association under the Joint Initiative "EnergySystem 2050-A Contribution of the Research Field Energy" supported this work. In addition, the authors acknowledge the financial support by the Federal Ministry for Economic Affairs and Energy of Germany in the project METIS (project number 03ET4064A).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A Load Profile
The aggregated electricity and heating demand profiles are shown in Figures A3 and A4. For instance, the month of January is presented for electricity and heating demand in Figures A5 and A6. Figures A1 and A2 show the state of charge of the battery storage and the hydrogen storage for 40 typical days. This reveals that the framework is capable of taking seasonal storage into account, which is the case for the hydrogen storage, whereas the optimized battery stores energy on a daily basis.

Appendix B Demand Characterization
An overview of the total demands and the PV generation potentials resulting from the preprocessing is given in Table A1. The bottom-up approach in the load profile simulation leads to different total electricity and heating demands for the individual buildings (multi-family houses). The highest electricity demand in the district occurs in building 3, with 24,892 kWh, while the lowest occurs in building 4, with 17,439 kWh. The highest peak load of an individual building occurs in building 1, with 22.97 kWh, followed by 22.05 kWh in building 3.
The building parameters and shapes of the individual multi-family houses are similar. Thus, the installable PV capacities range between 16.16 kWp in building 5 and 17.88 kWp in building 3. The roofs of the buildings are divided into two areas, roof 1 and roof 2.