An Optimal Scheduling Strategy of a Microgrid with V2G Based on Deep Q-Learning

Abstract: In recent years, the access of various distributed power sources and electric vehicles (EVs) has brought more and more randomness and uncertainty to the operation and regulation of microgrids. Therefore, an optimal scheduling strategy for microgrids with EVs based on Deep Q-learning is proposed in this paper. Firstly, a vehicle-to-grid (V2G) model considering the mobility of EVs and the randomness of user charging behavior is proposed. The charging time distribution model, charging demand model, state-of-charge (SOC) dynamic model and travel location model are comprehensively established, thereby realizing the construction of the mathematical model of the microgrid with EVs: the charging/discharging situation inside the EV station can be obtained, and from it the overall output power of the EV station. Secondly, based on Deep Q-learning, the state space and action space are set up according to the actual microgrid system, and the optimal scheduling reward function is designed with the goal of economy. Finally, the case study results show that, compared with traditional optimization algorithms, the strategy proposed in this paper has online learning ability and can better cope with the randomness of renewable resources. Meanwhile, the agent, equipped with experience replay, can be trained to complete the evolution process and thus adapt to the nonlinear influence caused by the mobility of EVs and the periodicity of user behavior, which is feasible and superior in the field of optimal scheduling of microgrids with renewable resources and EVs.


Introduction
A microgrid is a micropower system oriented to terminal energy users such as buildings, communities, industrial parks or towns, and it is expected to be one of the main forms of energy supply in future human society [1]. Its operational stability is usually maintained by various microsources [2,3]. In recent years, the application of distributed power sources in power systems has become more and more extensive, which brings more and more randomness and uncertainty to the operation and regulation of microgrids [4].
Nowadays, a large number of scholars at home and abroad have completed mature research on the optimization and dispatching technology of microgrids. In [5], the uncertainty of photovoltaic power generation is modeled based on probabilistic constraints, and an optimal scheduling method using chance-constrained programming to minimize the operating cost of microgrids is proposed. In [6], a genetic algorithm based on a memory mechanism is proposed to solve the problem of minimizing the operating cost of a microgrid. In [7], for a microgrid system combining renewable energy and traditional power generation, an energy management strategy for a hybrid thermoelectric island microgrid is proposed based on a multi-objective particle swarm optimization (MOPSO) algorithm. In [8], based on the chaotic search particle swarm optimization algorithm and with the goal of minimizing the total cost, an economic operation optimization model of the microgrid is constructed from three aspects, namely operating cost, environmental impact and system safety, so as to effectively reduce the operating cost of the microgrid and ensure the safety and stability of the power supply. However, the above methods are prone to falling into a local optimum when dealing with nonlinear or nonconvex problems, their computation time is long, and their generalization ability is insufficient. Therefore, in the face of the strong uncertainty of distributed power and load, the existing traditional optimization algorithms have difficulty meeting the requirements of microgrid optimal scheduling due to these limitations.
Meanwhile, with the continuous development of new energy vehicles, the EV industry has gradually become large-scale and market-oriented, and V2G technology has matured [9,10]. Accordingly, research on EVs participating in grid peak shaving and valley filling and in smoothing power fluctuations has deepened [11], but the mobility of EVs and the randomness of user behavior also pose greater challenges to maintaining the economic operation of microgrids. In [12], based on the MPC algorithm, EVs are used as mobile energy storage to participate in microgrid regulation, but the output power in the control model is constrained to a fixed value. In [13], the randomness of user travel demand is considered in the V2G model, and the EV state of charge is modeled, but the impact of EV mobility on the controllable capacity of EV stations is not considered. In addition, the above V2G models all treat the EV station as a single aggregated output, which cannot reflect the internal charging and discharging of the EVs. In practice, the controllable capacity of the EV station changes randomly due to users' charging behavior and the mobility of EVs.
Therefore, in order to cope with the randomness and uncertainty that the access of EVs and various distributed power sources brings to the economic dispatch of microgrids, reinforcement learning algorithms have been applied in the field of power systems [14,15]. In [16], reinforcement learning theory is introduced to construct a mathematical model suitable for microgrid energy management, which better solves the economically optimal scheduling problem of a microgrid. In [17], a reinforcement learning agent is applied to a microgrid system with distributed energy, which can formulate the optimal strategy for energy management and load scheduling among the three main bodies of power source, distributed energy storage and user. In [18], facing the economic dispatch problem of microgrids with distributed energy resources, an optimal equilibrium selection mechanism is proposed based on the reinforcement learning framework, which improves the operational performance of microgrids in terms of economy and independence. However, the above research does not focus on the V2G modeling of EVs and cannot truly reflect the process of EVs participating in the optimal scheduling of microgrids. Meanwhile, [16][17][18] are mainly based on traditional reinforcement learning algorithms, which cannot overcome the curse of dimensionality of the policy set in the face of complex environments or continuous actions. It is therefore difficult for them to deal with the influence of the random change of the controllable capacity of the EV station and the uncertainty of distributed power and load on the economic dispatch of the microgrid.
In summary, an optimal scheduling strategy for microgrids with electric vehicles based on Deep Q-learning is proposed in this paper. The main contributions are as follows: (1) A V2G mathematical model considering the mobility of EVs and the randomness of user charging behavior is proposed. The user charging time distribution model, charging demand model, EV state-of-charge (SOC) dynamic model and the model of travel location are comprehensively established, so that the agent can obtain the charging/discharging situation in an EV station to obtain the overall output power of the EV station. (2) A microgrid optimization scheduling strategy based on Deep Q-learning is proposed.
The strategy has the ability of online learning and can cope with the randomness of renewable resources better. Meanwhile, the agent with experience replay ability can be trained to complete the evolution process, so as to adapt to the nonlinear influence caused by the mobility of EVs and the periodicity of user behavior, which is feasible and superior in the optimal scheduling of microgrids with renewable resources and EVs.
The remainder of this paper is organized as follows: In Section 2, the mathematical model construction of a microgrid with EVs is established. The microgrid dispatch model based on Deep Q-learning is introduced in Section 3. The simulation results are presented and analyzed in Section 4, and the conclusions are summarized in Section 5.

The Mathematical Model Construction of Microgrid with EVs
The high penetration of renewable energy into the power grid may cause a series of problems affecting the balance of supply and demand in the system and the stable operation of the power grid. However, EVs are able to support large-scale integration of renewable energy by absorbing excess energy and returning it to the grid when needed. V2G technology can use the mobile energy storage characteristics of EVs to reasonably adjust their charging/discharging behavior, thereby alleviating the impact of load fluctuations [19,20]. Therefore, based on the randomness of user behavior and the mobility of EVs, a charging/discharging model for EVs is constructed, and an optimal scheduling model for microgrids with EVs is established.

The V2G Model of EVs
Firstly, it is assumed that the electric vehicle is fully charged before traveling and that its battery power consumption has a linear relationship with the daily mileage [21]. That is, after obtaining the probability distribution of the daily mileage of the electric vehicle, the probability distribution of the battery state of charge SOC_0 when it returns to the charging station can be obtained. The return times of different EVs within a day and the corresponding charging durations are also important components of the V2G model of the EV station.
In the existing EV models, due to the regular travel behavior of users, the arrival time and location of electric vehicles are relatively fixed. However, in the actual situation, EVs have mobility due to the randomness of the real-time road network. For example, in the case of traffic congestion, users will adjust their charging route decisions, which affects the arrival time and location of EVs and, in turn, the power consumption of EVs when they enter the station.
Uncertain influences such as road network congestion are closely related to the type of user distribution. For example, electric vehicles that stay in commercial areas for a long time are more likely to encounter congestion on their journey, while electric vehicles in public areas have higher driving speeds and lower unit power consumption. Therefore, the main travel behaviors and the proportion of each activity trip of electric vehicles are obtained in this paper, as shown in Table 1 and Figure 1. The distribution areas are divided into residential areas, commercial areas and public areas, abbreviated as R, C and P, respectively.

The daily mileage obeys a log-normal distribution L ∼ Log-N(µ_L, σ_L²), and its probability density function is shown in Formula (1):

f(L) = 1/(√(2π)·σ_L·L) · exp(−(ln L − µ_L)²/(2σ_L²)) (1)

where µ_L and σ_L represent the mean and standard deviation of ln L, respectively, which are determined by the type of user behavior. In addition, it is worth noting that, according to the behavior data of electric vehicle users, a vehicle owner charges every ε days on average, so the total mileage L_ε = ε·L of the electric vehicle when it enters the charging station also obeys a log-normal distribution, as shown in (2):

L_ε ∼ Log-N(µ_L + ln ε, σ_L²) (2)

Secondly, it can be assumed that the return time t_0 of the EV obeys the normal distribution t_0 ∼ N(µ_s, σ_s²), and its probability density function is Formula (3):

f(t_0) = 1/(√(2π)·σ_s) · exp(−(t_0 − µ_s)²/(2σ_s²)) (3)
where µ_s and σ_s represent the mean and standard deviation, respectively, which are likewise determined by the type of user behavior.

Furthermore, it can be assumed that the charging power of an EV after entering the charging station is constant. When the state of charge of the EV battery reaches SOC_m, the driving process expected by the user after the EV leaves the charging station can be satisfied. Therefore, according to the daily driving mileage, the time T_c for the battery to be charged to SOC_m after the EV enters the station can be calculated as shown in (4):

T_c = [W_m − (W_total − L·Q_100/100)] / P_c (4)

where L is the daily driving mileage of the EV, P_c is the charging power, Q_100 is the power consumption per 100 km, W_total is the full capacity of the EV battery and W_m is the energy stored when the state of charge equals SOC_m. The duration of the EV's stay in the station is defined as ∆T, and the departure time is defined as T_leave; evidently, ∆T ≥ T_c. Therefore, ∆T and T_leave satisfy the following formulas:

∆T = T_c + σ_T (5)
T_leave = T_enter + ∆T (6)
where σ_T is a positive random number whose value is selected according to the user's travel habits on weekdays, and T_enter is the time at which the EV enters the station to charge.

As shown in Figure 2, when the state of charge reaches SOC_m, or is already greater than SOC_m when entering the station, the EV can participate in the load-distribution optimal scheduling process of the microgrid. That is, it can be discharged when the microgrid encounters a consumption peak, and this discharge process will not drive the EV's charge below SOC_m. When the state of charge of the EV reaches SOC_max, the EV is no longer charged, in order to protect battery life; at this time, the EV either automatically stops charging (maintaining SOC_max) or discharges.

To sum up, different EVs have different entry times T_enter,i and necessary charging times ∆T_i; after the charge reaches SOC_m, they automatically participate in scheduling or continue charging according to the load state of the microgrid, and they leave the station at T_leave,i. The specific scheduling process is shown in Figure 3: EV1 enters the station at time T_1 with a state of charge below SOC_m, so it first charges, and from time T_3 until T_4 it participates in feed-in scheduling, its state of charge being greater than SOC_m in that interval. EV2 enters the station at time T_2 with a state of charge already above SOC_m, so it can immediately participate in dispatch when the microgrid is at peak load, until T_4. EV3 enters the station with an extremely low battery, so it remains charging the whole time, until time T_5.
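The stochastic arrival model above can be sketched in Python: daily mileage drawn from a log-normal distribution, return time from a normal distribution, arrival SOC from the linear consumption assumption, and the necessary charging time to SOC_m. The distribution parameters, battery capacity and charging power below are illustrative assumptions, not the paper's data:

```python
import random

def sample_ev_arrival(mu_L=3.2, sigma_L=0.9, mu_s=17.5, sigma_s=3.4,
                      epsilon=1, W_total=60.0, Q_100=15.0, P_c=7.0,
                      SOC_m=0.8, rng=random):
    """Sample one EV's arrival state under the assumptions of Section 2.1:
    total mileage since last charge is log-normal (charging every epsilon
    days), return time is normal, and consumption is linear in mileage.
    All numeric defaults are illustrative placeholders."""
    L = epsilon * rng.lognormvariate(mu_L, sigma_L)      # mileage (km)
    t0 = min(max(rng.gauss(mu_s, sigma_s), 0.0), 24.0)   # return time (h)
    used = min(L * Q_100 / 100.0, W_total)               # energy used (kWh)
    soc0 = (W_total - used) / W_total                    # SOC_0 on arrival
    W_m = SOC_m * W_total                                # energy at SOC_m
    T_c = max(W_m - soc0 * W_total, 0.0) / P_c           # charging time (h)
    return t0, soc0, T_c
```

Repeated sampling of this function yields the per-vehicle entry times and charging durations that drive the station-level model below.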
Therefore, it can be assumed that there are n EVs in the station at time t, of which i vehicles are non-chargeable but dischargeable (SOC = SOC_max), j vehicles are chargeable but non-dischargeable (SOC < SOC_m), and the remaining vehicles can both charge and discharge (SOC_m ≤ SOC < SOC_max). The boundary of the overall charging power of the EV station at time t is then shown in (7):

P_EV^+(t) = (n − i)·P_c,  P_EV^−(t) = −(n − j)·P_c (7)

where P_EV^+(t) and P_EV^−(t) are the boundaries of the interactive power output by the charging station (equal rated charging and discharging power is assumed). ∆P_EV^+(t) and ∆P_EV^−(t) represent the charging and discharging power exchanged between the electric vehicle charging station and the microgrid at time t, which are determined by the agent: the agent selects the optimal action according to the actual situation and economic benefit of the microgrid and obtains the charging/discharging situation inside the EV station at the current moment, so as to obtain the overall output power of the EV station, as shown in (8):

∆P_EV(t) = ∆P_EV^+(t) + ∆P_EV^−(t),  P_EV^−(t) ≤ ∆P_EV(t) ≤ P_EV^+(t) (8)
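The counting argument behind the station's power boundary can be sketched as follows. Vehicles at SOC_max are excluded from the charging bound, and vehicles below SOC_m are excluded from the discharging bound (user demand has priority); the rated power value and the assumption of equal charge/discharge rated power are illustrative:

```python
def station_power_bounds(socs, P_c=7.0, SOC_m=0.8, SOC_max=1.0):
    """Aggregate charge/discharge power bounds of the EV station at one
    time step, given the SOC of each parked vehicle. A vehicle at SOC_max
    cannot charge; a vehicle below SOC_m must not discharge."""
    p_plus = sum(P_c for s in socs if s < SOC_max)    # max charging power
    p_minus = -sum(P_c for s in socs if s >= SOC_m)   # max discharging power
    return p_plus, p_minus
```

Because vehicles arrive and depart stochastically, these bounds change from step to step, which is exactly the time-varying controllable capacity discussed later in the case study.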

The Optimal Dispatching Model of Microgrid
The microgrid structure considered in this paper is shown in Figure 4. The microgrid consists of wind turbines, photovoltaics, a micro-gas turbine, an EV station and other units. Therefore, the optimal dispatching model of a microgrid including EVs is constructed, where P_L is the load power, P_wt is the wind power, P_pv is the photovoltaic generation power, P_MT is the output power of the MT, ∆P_EV is the interactive power of the EV station and P_e is the exchange power between the microgrid and the large grid.

Objective Function
Considering the randomness of wind, solar and load, an optimization model is established with the goal of minimizing the expected total economic operating cost over the optimization period. Its objective function is shown in (9):

min C = Σ_t [C_gas(t) + C_e(t) + C_EV(t)] (9)

where C_gas represents the cost of purchasing natural gas for the micro-gas turbine, C_e represents the cost of purchasing and selling electricity through the interaction between the microgrid and the large grid and C_EV represents the cost of purchasing and selling electricity generated by the charging and discharging of the EVs.
The three cost terms are shown in (10):

C_gas(t) = c_gas · P_MT(t)·∆t / (η · q_NG)
C_e(t) = [e_b · P_buy,e(t) − e_s · P_sell,e(t)]·∆t
C_EV(t) = [c_s · |∆P_EV^−(t)| − c_b · ∆P_EV^+(t)]·∆t (10)

where P_MT(t) represents the output of the micro-gas turbine at time t, η represents the conversion efficiency of the micro-gas turbine, q_NG represents the low calorific value of natural gas and c_gas represents the gas purchase cost coefficient of the micro-gas turbine. P_buy,e and P_sell,e represent the power purchased from and sold to the large grid, and e_b and e_s represent the cost coefficients of purchasing and selling electricity. ∆P_EV^+ and ∆P_EV^− represent the charging and discharging power of the electric vehicle charging station, and c_b and c_s represent the cost coefficients of charging and discharging.

Constraints
(1) Power Balance Constraints:

P_wt(t) + P_pv(t) + P_MT(t) + P_buy,ev(t) + P_buy,e(t) = L(t) + P_sell,ev(t) + P_sell,e(t) (11)

where P_wt(t) and P_pv(t) represent the output power of the wind turbines and photovoltaics in period t, and L(t) represents the load in period t.
(2) Micro-gas turbine operating constraints:

P_MT,min ≤ P_MT(t) ≤ P_MT,max (12)
−R_d ≤ P_MT(t) − P_MT(t−1) ≤ R_u (13)

where P_MT represents the output power of the micro-gas turbine, R_d and R_u represent the downward and upward ramp rates of the micro-gas turbine and P_MT,min and P_MT,max represent its lower and upper output limits.

(3) Grid interaction power constraints:

P_e,min ≤ P_e(t) ≤ P_e,max (14)

(4) EV station constraints: the constraints on the overall charging/discharging power of the EV station have been given in Section 2.1, as shown in (7).
In addition, the main function of the EV station is to provide charging services for its users, so ensuring that each user's EV is sufficiently charged has the highest priority. On this basis, the charging and discharging power constraint of each EV in the station can be obtained, as shown in (15). Meanwhile, considering the interest relationship among EV users, the microgrid operator and the large grid, the price constraints can be obtained, as shown in Figure 5.
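The way the grid interaction power follows from the balance constraint (11), together with the box and ramp constraints above, can be sketched as a simple feasibility check. All limit values here are illustrative placeholders, not the paper's parameters:

```python
def grid_exchange(P_wt, P_pv, P_MT, dP_EV, L):
    """Derive the grid interaction power from the power balance:
    positive means power bought from the main grid; dP_EV > 0 means
    the EV station is charging (acting as extra load)."""
    return L + dP_EV - (P_wt + P_pv + P_MT)

def feasible(P_MT, P_MT_prev, dP_EV, P_e, P_MT_lim=(0, 800),
             R_d=300, R_u=300, P_e_lim=(-1000, 1000), ev_bounds=(-100, 200)):
    """Check MT output/ramp limits, grid exchange limits and the EV
    station bound for one dispatch step. Limits are illustrative."""
    return (P_MT_lim[0] <= P_MT <= P_MT_lim[1]
            and -R_d <= P_MT - P_MT_prev <= R_u
            and P_e_lim[0] <= P_e <= P_e_lim[1]
            and ev_bounds[0] <= dP_EV <= ev_bounds[1])
```

This is the same derivation that later allows the action space to be reduced: once the MT and EV powers are chosen, the grid exchange is no longer a free variable.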

Theory of Reinforcement Learning Algorithms
Reinforcement learning (RL) is a learning algorithm that maps environmental states to actions; its goal is to maximize the cumulative reward an agent receives during trial-and-error interaction with a given environment [22,23].
To achieve this, the reinforcement learning framework consists of an agent that takes an action a_t based on the current state s_t, as shown in Figure 6. After choosing an action at time t, the agent receives a scalar reward r_{t+1} and finds itself in a new state s_{t+1}, which depends on the current state and the chosen action. As shown in Figure 7, the Markov decision process (MDP) satisfies the Markov property and is the basic formalism of reinforcement learning, which can be described as:

P(s_{t+1} | s_0, a_0, · · · , s_t, a_t) = P(s_{t+1} | s_t, a_t) (16)

where P is the state transition probability. At each step, the agent takes an action that changes its state in the environment and yields a reward. To further process the reward value, a value function and an optimal policy are introduced. To maximize the long-term cumulative reward after the current time t, for a finite time horizon ending at time T, the return R_t is shown in (17):

R_t = Σ_{k=0}^{T−t−1} γ^k · r_{t+k+1} (17)

where the discount factor γ ∈ [0, 1], and γ can take the value 1 only in episodic MDPs.
To find the optimal policy, some algorithms are based on a value function V(s), which represents how beneficial it is for the agent to reach a given state s. This function depends on the agent's actual policy π:

V^π(s) = E_π[R_t | s_t = s] (18)

Similarly, the action-value function Q expresses the value of taking action a in state s under policy π:

Q^π(s, a) = E_π[R_t | s_t = s, a_t = a] (19)

In the Q-learning algorithm, the Q-function can be expressed in an iterative form by the Bellman equation:

Q(s_t, a_t) ← Q(s_t, a_t) + α[r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)] (20)

The optimal policy π* is the policy that yields the largest cumulative reward in the long run:

π* = argmax_π V^π(s) (21)

At this point, the optimal value function and optimal action-value function are shown in (22):

V*(s) = max_a Q*(s, a),  Q*(s, a) = E[r_{t+1} + γ max_{a'} Q*(s_{t+1}, a') | s_t = s, a_t = a] (22)
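The iterative Bellman update of the Q-learning algorithm can be sketched as a dictionary-based tabular update (a minimal sketch; the learning rate α and discount γ are illustrative values):

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped
    target r + gamma * max_a' Q(s', a'). Unseen entries default to 0."""
    target = r + gamma * max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
```

Deep Q-learning, used later in the paper, replaces the table `Q` with a neural network approximator, but the update target has exactly this form.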

Design of Optimal Scheduling Strategy for Microgrid Based on Deep Q-Learning
Deep Q-learning has the advantage of being suitable for solving optimal decision-making problems with uncertain factors and can be applied to solve the optimal scheduling problem of a microgrid considering intermittent renewable energy generation and the charging uncertainty of EV users. Therefore, the established mathematical model of the optimal scheduling problem for microgrids is transformed into a Deep Q-learning framework in this section.
The basic components of reinforcement learning include: the state space S representing the environment, the action space A representing the action of the agent, and the reward function r for training the agent.
(1) State space: The state variables of the microgrid system include the user electrical load demand, the photovoltaic generation power, the wind turbine generation power, the charging/discharging power capability of the EV station and the dispatching time period. Therefore, the state space can be expressed as:

S = { L(t), P_pv(t), P_wt(t), P_EV^+(t), P_EV^−(t), t } (23)

(2) Action space: After the agent observes the state characteristics of the environmental system, it generates actions based on its own policy π. Actions in the microgrid model with EVs can be represented by the output power of the micro-gas turbine, the interactive power between the EV station and the microgrid, and the power purchased from and sold to the grid. Therefore, the action space can be expressed as:

A = { P_MT(t), ∆P_EV(t), P_e(t) } (24)

In addition, when the powers of the MT and the EV station are known, the interactive power between the microgrid and the grid can be calculated from the power balance constraint. Therefore, the action space can be simplified as:

A = { P_MT(t), ∆P_EV(t) } (25)

(3) Reward function: In the optimal scheduling model of the microgrid proposed in this paper, the goal is to minimize the overall operating cost of the system, which includes the cost of purchasing and selling electricity between the microgrid and the grid, the cost of purchasing and selling electricity between the EV station and the grid, and the operating cost of the micro-gas turbine. Therefore, the minimization problem is transformed into a reward-maximization problem under the reinforcement learning framework, and the reward function of the agent can be expressed as:

r(t) = −[C_gas(t) + C_e(t) + C_EV(t)] (26)

In addition, when the agent is in the early stage of exploration, its policy model is not yet mature, and some actions may not satisfy the constraints. Therefore, an early termination mechanism with a penalty term is set up to improve the training speed.
From the action space (25), the interactive power between the microgrid and the grid is obtained by derivation from the balance constraint; as a result, this interactive power may cross its limits so that the constraints cannot be satisfied. To sum up, the reward function is constructed by stacking a penalty term, as shown in Equation (27):

r(t) = −[C_gas(t) + C_e(t) + C_EV(t)] − f_d · ∆P_vio(t) (27)

where f_d is the penalty term coefficient and ∆P_vio(t) denotes the amount by which the interactive power crosses its limit.
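A minimal sketch of the penalized reward: the operating cost is negated and a penalty proportional to the constraint violation is stacked on top. The exact functional form of the penalty term and the coefficient value are hedged assumptions for illustration:

```python
def reward(C_gas, C_e, C_EV, violation, f_d=10.0):
    """Negated operating cost as the agent's reward, with a stacked
    penalty f_d * violation applied when the derived grid interaction
    power crosses its limits (violation = 0 for feasible actions)."""
    return -(C_gas + C_e + C_EV) - f_d * violation
```

During early exploration the penalty dominates, steering the agent away from infeasible actions; once the policy matures, the term is zero and the reward reduces to pure negative cost.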

Neural Network Structure
In the optimization model of this paper, the random constraints of the electric vehicles and the output of new energy are strongly nonlinear data. Deep Q-learning combines a deep neural network with reinforcement learning, so it can effectively process large-scale data: agent training is completed with a large amount of data, so that real-time decisions are output according to real-time state variables and the optimal scheduling scheme is obtained. Therefore, this paper takes the state vector S as the input sequence of the neural network and finally obtains the approximated Q values in the output layer. The corresponding network structure is shown in Figure 8, which has h hidden layers, each composed of u neurons; the specific values of the (h, u) parameters depend on the actual calculation example. In the optimization model of this paper, the neural network has a total of four hidden layers, and the ReLU (Rectified Linear Unit) function is used as the activation function.
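The fully connected network of Figure 8 can be sketched with NumPy: state vector in, one Q value per discrete action out, with h hidden ReLU layers of u neurons each. The hidden width u = 64 and the He-style weight initialization are illustrative choices, not the paper's settings:

```python
import numpy as np

def init_qnet(n_state, n_action, h=4, u=64, rng=np.random.default_rng(0)):
    """Weights and biases for a fully connected Q-network with h hidden
    layers of u neurons, matching the structure of Figure 8."""
    sizes = [n_state] + [u] * h + [n_action]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def q_values(params, s):
    """Forward pass: apply ReLU on every hidden layer, leave the output
    layer linear so each entry approximates Q(s, a) for one action."""
    x = np.asarray(s, dtype=float)
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:      # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x
```

In practice the weights would be trained with the Adam optimizer against the Bellman target; this sketch covers only the forward structure.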

The Flow Diagram of Deep Q-learning Algorithm
The dispatch strategy of this paper is carried out in the following steps: First, the state space of the system is determined as S, and the action space is defined as A.
Second, the parameters are adjusted according to the actual computing instance, and the values of the reward function coefficients and hyperparameters are obtained.
Finally, the agent is trained, and after convergence, the known information of the microgrid is input to the agent so as to obtain the optimal dispatch scheduling result of the next day.
In summary, after applying the deep neural network to Q-learning, Deep Q-learning introduces the experience replay mechanism and the parameter-freezing (target network) mechanism in order to reduce the correlation between samples and improve the stability of training. Therefore, combined with the application scenario of this paper, the training process of Deep Q-learning in the microgrid with EVs is obtained as shown in Figure 9.
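The experience replay and exploration components of the training loop in Figure 9 can be sketched as follows. The capacity and batch size follow the hyperparameters reported in the simulation section; the ε-greedy behavior policy is the standard choice for Deep Q-learning:

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay: store (s, a, r, s') transitions and sample
    minibatches uniformly, breaking the temporal correlation between
    consecutive samples during training."""
    def __init__(self, capacity=10**6):
        self.buf = deque(maxlen=capacity)   # old transitions evicted first

    def push(self, s, a, r, s_next):
        self.buf.append((s, a, r, s_next))

    def sample(self, batch_size=256):
        return random.sample(list(self.buf), min(batch_size, len(self.buf)))

def epsilon_greedy(q_row, epsilon, rng=random):
    """Behavior policy: with probability epsilon pick a random action
    index (explore), otherwise pick the argmax of the Q values (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])
```

Each training step then samples a minibatch, computes the Bellman target with the frozen target network, and applies one gradient update to the online network.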

Simulation Results
In order to verify the effectiveness of the economic dispatch strategy for microgrids with electric vehicles based on Deep Q-learning proposed in this paper, the microgrid system with EVs shown in Figure 4 was used as an example for simulation research. The microgrid system includes a wind turbine (WT), photovoltaics (PV), a micro-gas turbine (MT) and an electric vehicle charging station. The equipment abbreviations and working parameters of the system are shown in Table 2. During operation, the range of the interactive power between the system and the grid is [−1000, 1000] kW, and the PV and WT output at the real-time maximum generation power. In this paper, the purchase cost of natural gas is 0.059 USD/kWh, the electricity purchase price of the system is 0.074 USD/kWh and the electricity selling price is 0.044 USD/kWh. The hyperparameter settings of the Deep Q-learning agent are as follows: the discount factor γ is 0.9, the minibatch size is 256, the experience pool size is 10⁶, the network learning rate α is 0.0001 and the Adam optimizer is used to update the network weights. The number of training iterations is 5 × 10⁵. The simulation platform was built in Python and run on an Intel Core i7-10700 CPU.

Case 1: Analysis of Electric Vehicle Mobility and User Behavior Habits
From the model in Section 2.1, the controllable capacity of the EV station is affected by the mobility of EVs and the randomness of user behavior. Taking the distribution of EVs in the microgrid on a certain day, generated by the model constructed in this paper, as an example (Figure 10), the number of EVs (and thus the maximum controllable capacity) in the EV station during T1, T2 and T3 differs considerably. Therefore, the ability of the EV station to participate in microgrid regulation also shows obvious peaks and valleys during the day, which is closely related to the living habits of the user group: between 8:00 and 12:00, a small number of EVs gradually enter the station. After 12:00, the number of EVs entering the station increases rapidly and reaches saturation at 23:00. In addition, after 24:00, the charge of most electric vehicles has exceeded SOC_m; at this time, the controllable capacity of the station reaches its peak. It starts to decrease gradually at 4:00, and a large number of EVs leave the charging station at 8:00, which makes the controllable capacity of the station plummet.
The above situation has brought a strong nonlinear influence to the microgrid dispatching process, which makes the traditional algorithm without evolution ability unable to adapt, thus posing a challenge to the dispatching of the power grid. Therefore, in order to better reflect the superiority of the Deep Q-learning algorithm in the dispatching of microgrids with EVs, the PSO algorithm will be introduced in this paper as a comparison.

Case 2: Energy Dispatching Results of a Microgrid
After the Deep Q-learning agent completes the training process, it has accumulated enough experience to complete the intelligent scheduling process of the microgrid [24]. The optimization results of the Deep Q-learning and PSO algorithms solving the same scheduling-day scenario are compared in Figures 11 and 12 and Table 3, in which the specific data of the typical scheduling day are shown by the yellow and pink histograms and the blue lines. It can be seen that: (1) Between 8:00 and 12:00, because most of the vehicles were stranded outside the station, the output of the EV station was small. Between 16:00 and 22:00, the EV station mainly acted as a load, as most of the EVs were charging in the station. After 22:00, the controllable capacity of the EV station gradually reached its peak and could be discharged to participate in dispatching.
(2) Compared with the PSO algorithm, the Deep Q-learning algorithm had online learning ability and, based on the experience accumulated during training, could better adapt to the staged mutation of the EV station's capacity caused by the mobility of EVs and the randomness of user behavior, which significantly enhanced the robustness and adaptability of the microgrid.
(3) The total operating cost of the microgrid under the Deep Q-learning algorithm was 801.07 USD, with a calculation time of 0.5 s; under the PSO algorithm, the total operating cost was 814.57 USD, with a calculation time of 7 min 23 s. In detail, the natural gas cost of the microgrid under the PSO algorithm was 825.34 USD, smaller than the 897.70 USD obtained by Deep Q-learning, because the total output of the MT in the PSO solution was smaller than in the Deep Q-learning solution. However, the unit generation cost of the MT is the lowest in the system, so the PSO algorithm had to make up the electricity shortfall by purchasing a large amount of power from the grid: the electricity purchase cost of the microgrid under the PSO algorithm was 159.07 USD, much greater than that under the Deep Q-learning algorithm.
(4) As shown in Figure 11, the PSO algorithm could not adapt to the nonlinear effects brought about by changes in the EV constraints, and its scheduling results were mostly in the charging state: although a small number of EVs participated in discharging, the charging station as a whole could not discharge and remained in a continuous charging state. As shown in Figure 12, Deep Q-learning adopted the most economical charging and discharging strategy under the constraint conditions: it discharged appropriately to reduce the power supply pressure when the load was high and acted as a power source at night to achieve economy.
In summary, the Deep Q-learning algorithm outperformed the PSO algorithm in all aspects; its advantage in flexibly handling the randomness of the EV station is particularly obvious.

Conclusions
In summary, an optimal scheduling model for microgrids with electric vehicles based on Deep Q-learning is proposed in this paper. Through simulation analysis under various scenarios, the following conclusions are drawn:
• As mobile energy storage components with V2G capability, EVs can participate well in the dispatching control of the microgrid, providing a more flexible dispatching scheme for its stable operation.
• Compared with traditional algorithms, Deep Q-learning with online learning ability can better adapt to the strong nonlinear effects caused by the mobility of EVs, the randomness of user behavior and renewable resources, based on the experience accumulated during training. The cost of the microgrid under Deep Q-learning was 801.07 USD with a calculation time of 0.5 s, while under the PSO algorithm the total operating cost was 814.57 USD with a calculation time of 7 min 23 s. Therefore, Deep Q-learning outperformed the PSO algorithm in all aspects, such as total operating cost, micro-turbine output, V2G interaction, grid-connected cost and computation time, as detailed in Section 4.2.