Toward Zero-Emission Hybrid AC/DC Power Systems with Renewable Energy Sources and Storages: A Case Study from Lake Baikal Region

Abstract: Tourism development in ecologically vulnerable areas such as the Lake Baikal region in Eastern Siberia is a challenging problem. To this end, dynamical models of an isolated hybrid AC/DC power system consisting of four power grids with renewable generation units and energy storage systems are proposed, using advanced methods based on deep reinforcement learning and integral equations. First, the wind and solar irradiance potential of several sites on the shores of Lake Baikal is analyzed, as well as the electric load as a function of the climatic conditions. The optimal selection of the energy storage system components is supported in online mode. The approach is justified using retrospective meteorological datasets. This formulation allows us to develop a number of recommendations on the optimal control of several autonomous hybrid AC/DC power systems with different structures, equipment compositions and current types (AC or DC). The developed approach provides valuable information at different stages of hybrid AC/DC power system project development with stand-alone hybrid solar-wind power generation systems.


Introduction
Over the past quarter century, a large number of interdisciplinary studies have focused on the integration of renewable energy sources (RES) and energy storage systems (ESS) in both centralized and autonomous hybrid AC/DC electric power systems. The installed capacity of renewable energy sources, including sunlight, wind, rain, tides, waves, and geothermal heat, reached 2011.33 GW according to statistics from the International Renewable Energy Agency. Moreover, over the past ten years, growth has amounted to more than 50% (1015 GW) of installed capacity [1]. The share of renewable energy sources in the global energy balance will grow from 30% to 40% by 2030 [2]. One of the main factors stimulating this scenario for the development of world energy systems is environmental protection and concern about the long-term rise of the average temperature of the Earth's biosphere [3]. Human activity, especially in the energy sector, causes an increase in the concentration of CO2 in the atmosphere. These challenges are prompting the global community to

Related Works
Hybrid AC/DC power systems can optimally accommodate the components and resources of future smart grids, including renewable DER, electric cars, and ESS. Many studies have examined the technical and economic advantages of combining DC and AC power in distribution systems. In [12][13][14][15], using DC power in a distribution network improved the throughput and voltage profile of the feeders of the distribution system. To benefit from both AC and DC, an intelligent hybrid AC/DC power system was proposed in [13]. This hybrid system reduced the cost of battery equipment used with renewable DERs. In [15], using DC power in a distribution system led to higher throughput and lower power losses than in a pure AC system.
Some recent works [16,17] considered the optimal planning of hybrid AC/DC power systems in general. For example, an algorithm proposed in [16] for planning the expansion of hybrid AC/DC transmission systems can select the optimal combination of AC/DC transmission lines from a predefined set of candidates. This model has two main drawbacks: (1) the number of scenarios for the solution is predetermined; (2) power losses associated with AC/DC converters and DC lines are not taken into account in the calculations. An attempt to overcome these problems is made in [18], where the authors proposed a stochastic planning model for hybrid AC/DC distribution systems, which is able to find the optimal hybrid AC/DC configuration of buses and lines in the distribution system. The objective of the planning model is to minimize the costs of installing and operating a distribution system.
Recent studies consider the operation of various types of isolated compact power systems with small capacity (from 10 kW to 5 MW): AC, DC, and hybrid AC/DC microgrids (MGs). For grid-connected operation, an isolated hybrid AC/DC microgrid can be connected to a distribution power network and other MGs to form a community of MGs [19]. Prior work on communities of MGs mainly focuses on energy cooperation. In this setting, each MG coordinates its local resources [20][21][22][23][24], the distribution power network [25,26], or other MGs [27][28][29][30][31].
Recently, stochastic optimization models based on deep reinforcement learning were proposed for managing the local resources of an MG [21][22][23][24]. Such machine learning models have demonstrated effectiveness and certain advantages, such as a reduction in the computational complexity of multi-objective problems and the ability to solve non-convex optimization problems. The authors of [21] and [22] introduced deep Q-network (DQN) architectures for operating isolated MGs in a stochastic DER environment, which included PV systems, batteries, hydrogen storages and diesel generators. These approaches were empirically illustrated on isolated AC/DC MGs located in Belgium and Eastern Siberia (Russia). In [23], a reinforcement-learning-based online optimal smooth control method is proposed for ESS in hybrid AC/DC MGs involving PV systems and diesel generators. The authors used neural networks to estimate the nonlinear dynamics of storage systems and to learn the optimal control input that yields smooth charging and discharging control for ESS in MGs with unknown system parameters. In [24], the authors used a DQN algorithm for MG energy management, taking into account the stochastic nature of the input data. It was shown that the employed DQN algorithm is able to select cost-effective schedules using ESS charging and discharging control. The performance of the DQN approach was evaluated using real power-grid data from the California Independent System Operator.
For coordination between an MG and the external distribution network, [25] proposed a hierarchical optimization approach to solve the problems of interaction between the distribution electric network and MGs. In [26], a two-level MG model is presented for the optimization problem. Using transaction-based optimization between the MG and the distribution network, this model can reduce losses and improve voltage quality. In terms of the coordinated operation of a community of MGs, [27] introduced the idea of sharing resources among a community of MGs to effectively reduce the amount of electricity purchased from the utility network. The authors of [28] presented a new approach for the coordinated operation of the community, obtained with a stochastic bi-level model. In [29], the authors considered coordinated information and strategies within the community to reduce MG operational costs. However, these works on communities of MGs mainly focus on energy cooperation, while reserve or emergency cooperation to overcome uncertain DER output power is not considered. To solve these problems, a cooperative energy and reserve scheduling model based on game theory was proposed in [30], which can contribute to the optimal operation of a community of MGs. The concept of a community MG operator was proposed in [31]. In this case, the actions of the benevolent planner in redistributing income and expenses among members ensure that the outcome reached by each member of the community is no worse than the outcome that member would have achieved individually.
Despite the many existing works, determining the optimal technical characteristics of an AC/DC network within a community of MGs or more powerful hybrid power grids, especially ones located far apart, remains an open problem. This is the main motivation for the contribution of this paper.

Paper Contribution
To address these issues, the article proposes a new algorithm for operational and emergency control of a hybrid AC/DC system combining isolated grids into a community (Figure 1). The main feature of the studied energy systems is that the combined grids are located at a distance from each other, and the network connecting them is built with minimum investment. For this reason, optimal power exchange between subsystems requires taking the network equations into account.
The network is formed in the following stages:
1. Isolated operation. At this stage the grids are isolated. Each subsystem includes the following elements: load, RES with storage, and diesel generation. The control objective for every subsystem is to minimize the power supply cost by means of optimal storage management, which corresponds to minimizing the operating time of the diesel generator. The power supply cost and the CO2 emissions are highest in isolated operation.
2. Community forming. Integrating subsystems into a single community is possible only if it is technically and economically feasible, for example, because of the proximity of the transmission network or a reduction in the cost of electricity compared to diesel generation. In this case, after the smart connection of the MGs, the cost of power supply and the CO2 emissions will decrease.
Traditionally, isolated grids are combined by means of an AC distribution network with a radial structure. The inability to create loops leads to low control flexibility. For example, there is no possibility of power exchange between subsystems over the shortest distance with minimum transmission costs.
Combining subsystems using both AC and DC offers significant advantages. DC network loops, together with inverter coordination, significantly expand the control boundaries. Thus, in the future, with the development of conversion and control technologies, AC/DC combining may turn out to be more profitable.
The following tasks of managing a hybrid AC/DC power system can be distinguished:
1. AC/DC system planning. This task is relatively new and poorly studied. Given the high degree of uncertainty due to the complexity of the structure, the presence of RES and storages, and the large number of owners, selecting specific criteria for optimal planning is quite complicated. The most successful attempt, in our opinion, was made in [18].
2. AC/DC operational and emergency control: optimal control of normal and emergency conditions for a given network structure.
Further in this paper we consider only the tasks of optimal operational and emergency control. The paper is organized as follows. Section 2 describes the proposed two-level operational and emergency control algorithms. This section also provides a brief description of the steady state model for the hybrid AC/DC network. Section 3 focuses on the case study. This section also provides the results of operational control of the converter settings during one winter day. Section 4 is for conclusions and further work.

Methodology
This section provides a methodological description of the proposed hybrid network management approaches. The two-level operational and emergency control algorithms are described in Sections 2.1 and 2.2, respectively. Section 2.3 gives a brief description of the steady-state model for the hybrid AC/DC network.

Hybrid Network Operational Control Algorithm
Based on practical experience, operational management must strike a balance between efficiency and ease of technical implementation. Including excess information in the control cycle in order to increase effectiveness can significantly complicate the algorithm and/or its technical implementation. At the same time, the volume of control actions should be minimal, and their implementation should be understandable and free of significant uncertainty. Based on the foregoing, in this paper we propose a two-level algorithm for optimal operational control of a hybrid network, which includes local and centralized levels (see Figure 2). It is assumed that the future community is connected to an external centralized grid, and that the isolated AC/DC power systems contain their own intelligent energy management systems (EMSs) for optimal energy management. The stochastic behavior of both load demand and renewable energy is considered in the proposed centralized EMS model. The input of this model consists of the optimal management policies for each isolated hybrid AC/DC power system as a potential participant in the future community, which are initially generated using an intelligent DQN-based EMS. As a consequence, each entity of such an energy network community can benefit from joining the community for the following reasons: • a more efficient allocation of resources, allowing energy trading at more favorable prices; • the provision of aggregated reserve; • a decrease in peak power cost.

Local Level
At the local level, the problem of stochastic optimization of storage system control is solved in order to minimize the operating costs of an isolated hybrid AC/DC network. The sub-optimization problem is formulated as a partially observable Markov decision process (MDP) in order to determine the optimal (maximum) operational revenues for each individual scenario of the network configuration. The optimal operation of a hybrid AC/DC grid is treated as an agent that interacts with its environment [30]. At each time step $t$, the agent observes a state variable $s_t$, takes an action $a_t \in A$ and moves into a state $s_{t+1}$. A reward signal $r_t = \rho(s_t, a_t, s_{t+1})$ is associated with the transition $(s_t, a_t, s_{t+1})$, where $\rho: S \times A \times S \to \mathbb{R}$ is the reward function. The state-action value function $Q^*(s, a)$ associated with an optimal policy $\pi^*$ characterizes the quality of taking action $a$ at state $s$ and then acting optimally. It is defined by Equation (1), where $r(s, a) \in \mathbb{R}$ is the revenue function (i.e., the reward function): each transition generates an operational revenue for each individual scenario of the network configuration. Following [21], a deep neural network is employed to approximate $Q^*(s, a)$; the notation $Q(s, a; \Theta)$ is used for this so-called Q-network. Deep neural networks offer generalization properties that are adapted to high-dimensional sensory inputs such as time series. The algorithm that combines Q-learning with deep neural networks representing the optimal Q-function is called DQN [31]. The neural network parameters $\Theta$ can be updated using stochastic gradient descent by sampling batches of transitions, i.e., quadruples $(s, a, r, s')$; the parameters $\Theta$ are updated according to Equation (2), where $\alpha$ is a scalar step size called the learning rate. In general, a hybrid AC/DC power grid is off-grid and the goal is to maximize operational revenues.
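The parameter update described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear Q-approximator (one weight vector per discrete action) instead of a deep network, a single transition instead of a batch, and a hypothetical discount factor `gamma`, which the text does not specify. The function names `q_values` and `q_update` are illustrative only.

```python
from typing import List, Sequence, Tuple

Transition = Tuple[Sequence[float], int, float, Sequence[float]]  # (s, a, r, s')


def q_values(theta: List[List[float]], state: Sequence[float]) -> List[float]:
    """Linear stand-in for the Q-network: Q(s, a; theta) = theta[a] . s."""
    return [sum(w * x for w, x in zip(weights, state)) for weights in theta]


def q_update(theta: List[List[float]], transition: Transition,
             alpha: float = 0.1, gamma: float = 0.9) -> List[List[float]]:
    """One stochastic-gradient step on a single transition (s, a, r, s').

    alpha is the learning rate; gamma is an assumed discount factor.
    """
    s, a, r, s_next = transition
    # Bootstrapped target: reward plus discounted best next-state value.
    target = r + gamma * max(q_values(theta, s_next))
    td_error = target - q_values(theta, s)[a]
    # For a linear model, the gradient of Q(s, a; theta) w.r.t. theta[a] is s.
    new_theta = [list(weights) for weights in theta]
    new_theta[a] = [w + alpha * td_error * x for w, x in zip(theta[a], s)]
    return new_theta
```

With a deep network, the same temporal-difference error would be back-propagated through the network instead of multiplied by the state vector.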
We propose to employ the concept of a virtual power plant (VPP), which is based on the idea of aggregating the capacities of many DERs (i.e., generation, storage or demand) of hybrid AC/DC networks to create a single operating profile and manage uncertainty. A VPP can coordinate all DERs as a single agent to integrate them into the network without jeopardizing the stability and reliability of the network, while adding many other advantages and opportunities for consumers, prosumers and the grid operator [32]. This makes the VPP EMS a good candidate to justify our DQN-agent-based approach (Figure 3). The DQN-based agent of the VPP EMS only has access to the current aggregated non-flexible consumption and non-steerable (i.e., renewable, PV and wind) generation, as well as the 24 h and 48 h ahead forecasts of renewable generation for the hybrid AC/DC power grid. It also has access to the state of charge of the different storages and the aggregated capacity of the steerable generators (diesel units). As a result, it must decide how to optimally use the storage systems and steerable generators. As shown in Figure 3, the VPP EMS can produce control actions only for the virtual storage and the steerable generator, while the aggregated capacities of non-steerable generation and loads are only inputs of the VPP EMS.
We consider various types of storage devices in order to be able to respond to both short-term and long-term fluctuations in renewable power generation. The gensets, i.e., the steerable diesel generation, compensate to establish the equilibrium. Depending on the configuration of the hybrid power grid, when there is an excess of non-steerable generation and no more room for storage, the non-steerable generation is either lost or stored in hydrogen/fuel cells.
The reward function of the system corresponds to the instantaneous operational revenues $r_t$ at time $t \in T$. Three quantities are prerequisites to the definition of the reward function: the electricity generation (in Wh), the net electricity demand (in Wh) and the power balance $\delta_t$ (in Wh) within the isolated AC/DC power grid (Figure 3). From the series of rewards $r_t$, we get the operational revenues over year $y$, defined as follows:

$$M_y = \sum_{t \in \tau_y} r_t,$$

where $\tau_y$ is the set of time steps belonging to year $y$. Therefore, the optimal operation policy of the hybrid AC/DC power grid is determined by the maximization of $M_y$ [33].
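The aggregation idea behind the VPP can be sketched as follows. This is only an illustration under stated assumptions: each DER is represented by a hypothetical per-time-step power profile of equal length, and the helper name `aggregate_profiles` is not from the source.

```python
from typing import List, Sequence


def aggregate_profiles(profiles: Sequence[Sequence[float]]) -> List[float]:
    """Sum several DER power profiles into one VPP-level profile.

    Every profile must cover the same time steps (same length).
    """
    lengths = {len(profile) for profile in profiles}
    if len(lengths) != 1:
        raise ValueError("profiles must be non-empty and of equal length")
    # zip(*profiles) groups the values of all DERs for each time step.
    return [sum(values) for values in zip(*profiles)]


# Example: two non-steerable sources aggregated into a single input profile.
pv_output = [0.0, 1.2, 2.5]
wind_output = [0.8, 0.6, 0.1]
vpp_generation = aggregate_profiles([pv_output, wind_output])
```

The aggregated profile plays the role of the single non-steerable generation input seen by the VPP EMS agent.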
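The yearly operational revenue is simply the sum of the instantaneous rewards over the time steps of that year. A minimal sketch, assuming the rewards are available as hypothetical `(year, reward)` pairs:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def annual_revenues(rewards: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Sum the instantaneous rewards r_t over the time steps of each year."""
    totals: Dict[int, float] = defaultdict(float)
    for year, reward in rewards:
        totals[year] += reward
    return dict(totals)


# Example with made-up rewards; negative values represent costs.
example = annual_revenues([(2018, 1.5), (2018, -0.5), (2019, 2.0)])
```

A policy can then be compared with another by evaluating this sum on the same retrospective time series.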

Centralized Level
The centralized level algorithm controls the settings of the inverters $P_{inv_1}, P_{inv_2}, \dots, P_{inv_n}$ in order to minimize the active power flow $P$ to the external network ($\min P$). As constraints, in this work we took into account the maximum values of the inverter capacities, as well as the need for an AC/DC power flow solution to exist:

$$|P_{inv_i}| \le P_{inv_i}^{\max}, \quad F_{AC} = 0, \quad F_{DC} = 0,$$

where $P_{inv_i}^{\max}$ is the maximum capacity of the $i$-th inverter; $|P_{inv_i}|$ is the absolute value of the $i$-th inverter setting; $F_{AC}$ are the AC mismatch equations; and $F_{DC}$ are the DC mismatch equations. See Section 2.3 for details. As input data, the optimization algorithm receives information about the network topology, as well as the current generation/consumption level at the AC and DC nodes. The input data is needed for state estimation and for solving the AC/DC power flow. The proposed algorithm provides optimal redistribution of active power between subsystems while minimizing network losses. At every control cycle, subsystems with excess power cover the needs of subsystems with a deficit. Any total active power excess is transferred to the external network with minimum losses, and any total active power deficit is covered from the external network with minimum losses. The advantage of the proposed algorithm is its relative ease of implementation, which is provided by the two-level structure. It is also important to note the possibility of taking the network equations into account. As a rule, when the aggregation of MGs is analyzed, the electrical network is either not taken into account at all or is taken into account in a simplified form. However, because capital investments are minimized, the network infrastructure may turn out to be the weakest link restricting the power exchange between the subsystems. In this case, neglecting the network equations can lead to unacceptable operating modes.
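The capacity constraint and the resulting exchange with the external network can be illustrated with a deliberately simplified sketch. It assumes a lossless balance and ignores the AC and DC mismatch equations entirely, so it is not the optimization described above; a real implementation would embed an AC/DC power flow solver. All function names are hypothetical.

```python
from typing import List, Sequence


def inverter_settings(net_power: Sequence[float],
                      capacity: Sequence[float]) -> List[float]:
    """Clip each subsystem's net power (surplus > 0) to its inverter capacity."""
    if len(net_power) != len(capacity):
        raise ValueError("one capacity per inverter is required")
    return [max(-c, min(c, p)) for p, c in zip(net_power, capacity)]


def within_limits(settings: Sequence[float], capacity: Sequence[float]) -> bool:
    """Check the constraint |P_inv_i| <= P_inv_i_max for every inverter."""
    return all(abs(p) <= c for p, c in zip(settings, capacity))


def external_exchange(settings: Sequence[float]) -> float:
    """Lossless balance: positive means export, negative means import."""
    return sum(settings)


settings = inverter_settings([3.0, -1.0, -5.0], [2.0, 2.0, 4.0])
exchange = external_exchange(settings)
```

In this toy case the first and third inverters are limited by their capacities, and the community as a whole imports power from the external network.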

The Relationship between Local and Centralized Levels
The storage systems' charge-discharge process can be described using the Volterra integral models [34]. The storage system optimization should be clearly distinguished from the storage system modeling. The latter can be attacked using generalization of the recently proposed Volterra balance model [35].
We now give a brief introduction to the Volterra model of a storage system and validate the MDP model using an approach based on Volterra equations. Volterra models describe the evolution of the system state. The conventional ampere-hour integral model (direct problem)

$$SOC(t) = SOC(0) + \int_0^t \eta(\cdot)\, i(\tau)\, d\tau$$

in [34] is considered as an inverse problem with respect to the instantaneous storage current $i(\tau)$, which is assumed positive for charge and negative for discharge. Here $\eta(\cdot)$ is the storage efficiency, which can in turn be a function of the SOC. The SOC can be expressed in % or in ampere-hours (or kWh). The Volterra integral equation

$$\int_0^t K(t, s)\, x(s)\, ds = f(t)$$

is a useful tool for storage modeling, where the source function $f(t)$ and the kernel $K(t, s)$ are known and $x(s)$ is the desired function. For a community of MGs with storage systems, it is useful to employ the system of Volterra integral equations with jump-discontinuous kernels and constraints given in Equation (3), which combines the Local and Centralized levels in one mathematical formulation. The quantities entering Equation (3) are as follows. The system contains one equation per grid in the community. The functions defining the discontinuity curves of the kernels show the proportions in which the units of the storage system are used in each grid; for example, if a grid has three batteries used in equal proportions, these functions are $0$, $t/3$, $2t/3$ and $t$. The number of such functions is determined by the number of units in the storage system of the corresponding grid. The diagonal elements of the kernel matrix show the efficiency of the storage system of each grid, while the remaining elements show, at the Local Level, the coefficients of power flow from the storage systems of other grids. The right-hand side combines the generation of RES, the predicted electric load of the community and the AC/DC power flow at the Centralized Level. The constraints bound the maximum charging speed of each storage and the storage levels. The alternating power function (APF) of each storage can be found using the proposed model. Such models can be employed to simulate the degradation processes in the storage systems of MGs using retrospective time series of generation and load for a specific location.
Numerical results for the proposed integral model, obtained with the collocation method proposed in [36,37] for determining the APF and SoC, are shown on real datasets below.
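To make the numerical treatment of a first-kind Volterra equation concrete, the following sketch solves $\int_0^t K(t,s)\,x(s)\,ds = f(t)$ on a uniform grid with the midpoint rule. This is a simple textbook discretization chosen for illustration, not the collocation method of [36,37], and it assumes a kernel that does not vanish on the diagonal. The function name is hypothetical.

```python
from typing import Callable, List


def solve_volterra_first_kind(kernel: Callable[[float, float], float],
                              f: Callable[[float], float],
                              t_end: float, n: int) -> List[float]:
    """Midpoint-rule solution of int_0^t K(t, s) x(s) ds = f(t).

    Returns approximations of x at the midpoints (j + 0.5) * h, j = 0..n-1.
    """
    h = t_end / n
    x: List[float] = []
    for i in range(1, n + 1):
        t_i = i * h
        # Contribution of the already computed midpoint values.
        known = sum(kernel(t_i, (j + 0.5) * h) * x[j] for j in range(i - 1))
        s_mid = (i - 0.5) * h
        # Solve the i-th discretized equation for the newest unknown.
        x.append((f(t_i) - h * known) / (h * kernel(t_i, s_mid)))
    return x


# Example: K = 1 and f(t) = t^2 give the exact solution x(s) = 2 s.
solution = solve_volterra_first_kind(lambda t, s: 1.0, lambda t: t * t, 1.0, 10)
```

A piecewise-defined `kernel` can be passed in the same way to experiment with jump discontinuities, although accuracy near the discontinuity curves then depends on how the grid aligns with them.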

Hybrid Network Emergency Control Algorithm
Conventional approaches to emergency control can be divided into local and centralized. Local control is carried out by simple devices with high-speed algorithms. This decentralized approach provides a high level of reliability. For instance, if a slack converter is lost, its functions can be transferred to another droop-controlled converter. In some cases, however, local control can be inefficient because of the lack of system-wide information. Centralized control is required for the effective management of complex systems. In this case, the control algorithm becomes more complex and requires the collection of pre-emergency parameters from the EMS.
In this paper, we propose to use the operational control algorithm described above as a key element for implementing emergency control in a hybrid AC/DC system. Using the same optimization procedure, the proposed centralized emergency control layer provides an optimal transition to a post-emergency state. The control actions must be calculated in advance. Figure 4 shows the calculation cycle of the proposed centralized emergency control for hybrid networks. A database of control actions should be produced on every cycle. The relatively small number of elements in the community means that only a relatively small number of disturbances must be considered. In case of emergency, the control actions are retrieved instantly from the database.
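The precompute-then-look-up cycle described above can be sketched as a dictionary keyed by disturbance. The disturbance labels and the `toy_optimizer` below are hypothetical placeholders; in the proposed scheme the actions would come from the centralized optimization procedure.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, TypeVar

Action = TypeVar("Action")


def build_action_database(disturbances: Iterable[Hashable],
                          compute_actions: Callable[[Hashable], Action]
                          ) -> Dict[Hashable, Action]:
    """Run once per control cycle: precompute actions for every disturbance."""
    return {d: compute_actions(d) for d in disturbances}


def on_emergency(database: Dict[Hashable, Action], disturbance: Hashable,
                 default: Optional[Action] = None) -> Optional[Action]:
    """Retrieve the precomputed actions without running any optimization."""
    return database.get(disturbance, default)


def toy_optimizer(disturbance: Hashable) -> dict:
    """Placeholder for the real optimization; returns made-up settings."""
    return {"lost_element": disturbance, "inverter_setpoints": [0.0, 0.0, 0.0]}


database = build_action_database(["line_1_2", "inverter_3"], toy_optimizer)
actions = on_emergency(database, "inverter_3")
```

Because the lookup is a constant-time dictionary access, the response time in an emergency does not depend on how long the optimization itself takes.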

Power Flow Equations of Hybrid Systems
The proposed algorithms need a steady-state model for power flow calculation at the centralized level. Usually, two types of AC/DC solvers are considered: the unified approach [38], in which the AC and DC equations are solved simultaneously, and the sequential approach [39], in which the AC and DC equations are solved separately. In some cases, the sequential approach may lead to divergence [40] or to worse convergence [41]. For this reason, the unified method was implemented in our studies.
The rest of this section provides a brief description of the steady-state equations of hybrid systems. The considered VSC model, shown in Figure 5, includes a coupling transformer, a reactor and a high-harmonics filter.

AC Side Equations
The AC side is represented by the following set of equations:

DC Side Equations
The DC side is represented by a set of equations that includes the total loss of the $i$-th converter. The total loss, the transformer current and the reactor (converter) current are each obtained from separate expressions. In Equation (4), the injected DC power of the non-slack DC buses into the DC network is calculated using the elements of the DC system admittance matrix.
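As an illustration of how the injected DC power depends on the DC admittance matrix, the sketch below uses the standard nodal form $P_i = p\,V_i \sum_j Y_{ij} V_j$, where $p$ is the number of poles. This textbook expression is an assumption; the source's exact formula is not reproduced here, and the function name is hypothetical.

```python
from typing import List, Sequence


def dc_injected_power(y_dc: Sequence[Sequence[float]],
                      v_dc: Sequence[float], poles: int = 2) -> List[float]:
    """Injected DC power at every bus: P_i = poles * V_i * sum_j Y_ij * V_j.

    y_dc is the nodal DC admittance (conductance) matrix, v_dc the bus
    voltages; poles = 2 corresponds to a bipolar DC system.
    """
    n = len(v_dc)
    return [poles * v_dc[i] * sum(y_dc[i][j] * v_dc[j] for j in range(n))
            for i in range(n)]


# Two buses joined by one line with conductance 10 (per-unit, monopolar case).
y = [[10.0, -10.0], [-10.0, 10.0]]
injections = dc_injected_power(y, [1.0, 0.98], poles=1)
line_losses = sum(injections)
```

The sum of the injections equals the resistive loss in the DC network, which is a convenient sanity check for any DC power flow result.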

Data Sets
Goryachinsk village, located on the coast of Lake Baikal, was selected for the case study. The retrospective time series were taken from open sources. Namely, there are 11 years of meteorological observations for the selected location. Figure 6 shows the change in solar radiation over the past 12 years. The solar radiation is high in the summer and reaches 180-195 kWh/m² per month. Wind speed at a height of 10 m at the considered location is low, not exceeding 4 m/s. Figure 7 shows the average wind speed over the past 11 years. The retrospective datasets of solar radiation and wind speed can be used to model the operational parameters of solar panels and wind generators. In addition, these data can be used for short-term forecasting and for building an optimal energy system management strategy. The proposed approach has been validated on real annual climatological and load datasets from the Goryachinsk resort village, Lake Baikal region. The historical datasets consist of mean hourly wind speed and direct normal solar irradiance time series, as well as typical electricity load profiles in the Goryachinsk village with a 50 MW total peak critical load.
Based on this real dataset, we examined isolated AC/DC grid options for four holiday resort villages featuring DERs associated with different combinations of aggregated elements: PV, wind production, batteries, hydrogen storages and diesel units. The main parameters are listed in Table 1.

Local Level of Energy Grid Management
Initially, we considered the case where the hybrid AC/DC power grids are off-grid and the goal is to minimize the operating cost. We used the DQN architecture with the state vector as input and the Q-value for each discretized action as a separate output. The information available at each time step is composed of the consumption, the state of charge, the renewable production, and predictions of future PV or wind production for the next 24 h and 48 h. Wind and irradiance predictions were produced in a naive way by averaging past values. We assume that the agent has control of the storage devices and must decide how to use the storage systems. The actions available at each decision step are charging, discharging and idling of each storage device in the microgrid. When the energy from storages and from non-flexible production is not sufficient to serve the loads, the steerable generators, i.e., the diesel generation, supply the remaining energy.
As mentioned above, we examined four different isolated AC/DC power systems containing DERs with different initial parameters (Table 1). Two systems had diesel stations. Each such hybrid AC/DC system has a VPP EMS based on a DQN agent for optimal energy management.
Starting with a random DQN, we perform the update specified in Equation (2) at each time step and, at the same time, fill a replay memory with all observations, actions and rewards, with an agent that follows an ε-greedy policy: the action $\arg\max_{a \in A} Q(s, a; \Theta)$ is taken with probability 1-ε, and a random action (with uniform probability over actions) is chosen with probability ε. Here ε decreases over time. At the validation and test stages, the policy is applied with ε = 0. The typical winter policies computed with minimal information available to the DQN agent for isolated AC/DC grids are shown in Figure 8. Costs are associated with the inability to cover demand through local sources, which involves purchasing energy from an external network or disconnecting consumers. The presence of a diesel generator makes it possible to obtain more income (for example, for Grid 1) or to reduce losses (for example, for Grid 4 in comparison with Grid 2). However, such generators themselves entail additional fuel costs, as well as constant pollution of the surrounding area in the form of CO2 emissions. Obviously, one of the most effective solutions is to unite isolated AC/DC grids into a single community through an electric network, which will compensate for the insufficient weather-dependent potential of RES generation, reduce (or eliminate) the use of diesel generators and, most importantly, increase the revenues of each power grid through optimal energy exchange.
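The discretized action set and the naive forecast described above can be sketched as follows. The three per-storage actions come from the text; the averaging window and the function names are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

STORAGE_ACTIONS = ("charge", "discharge", "idle")


def action_space(n_storages: int) -> List[Tuple[str, ...]]:
    """All joint actions: one of charge/discharge/idle per storage device."""
    return list(product(STORAGE_ACTIONS, repeat=n_storages))


def naive_forecast(history: Sequence[float], window: int) -> float:
    """Predict future production as the mean of the last `window` values."""
    recent = history[-window:]
    if window <= 0 or not recent:
        raise ValueError("need a positive window and a non-empty history")
    return sum(recent) / len(recent)


# A grid with a battery and a hydrogen storage has 3 ** 2 = 9 joint actions,
# so the Q-network would need nine outputs in this configuration.
joint_actions = action_space(2)
forecast = naive_forecast([1.0, 2.0, 3.0, 4.0], window=2)
```

The number of joint actions grows exponentially with the number of storage devices, which is one reason to aggregate devices into a virtual storage.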
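The ε-greedy action selection used during training can be sketched as below. The linear decay schedule and its parameter values are hypothetical, since the text only states that ε decreases over time.

```python
import random
from typing import Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random = random.Random(0)) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(step: int, eps_start: float = 1.0, eps_end: float = 0.05,
                    decay_steps: int = 1000) -> float:
    """Linearly decrease epsilon from eps_start to eps_end (assumed schedule)."""
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


# With epsilon = 0 (validation and test), the choice is always greedy.
greedy_action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0)
```

Setting `epsilon=0.0` reproduces the evaluation setting, where exploration is switched off.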

Load Leveling in MGs Using System of Volterra Equations
The objective of this section is to demonstrate the application of the Volterra equations model to battery modeling. In this case, the Volterra model introduced in (3) is used with the following settings: the maximum power with which the storage system in the i-th grid can be charged and discharged is limited to 3.5; the storage level is constrained between 0% and 100%; and the imbalance between generation, losses, power flow and load is to be compensated by the storage system. The APF and SoC calculated with the Volterra model for Grid 2 are shown in Figure 9.
[Figure 9 panels: (a) isolated AC/DC grid 2, winter; (b) isolated AC/DC grid 2, summer.]
Figure 10 shows a hybrid test AC/DC community that includes different types of renewable generation, loads and storages. The corresponding test system data is shown in Table 2. The community consists of four holiday resort villages with a 50 MW total peak critical load. Each MG is assumed to have wind and solar power plants and an ESS consisting of a battery and a hydrogen storage system. RES and storage devices are located on the DC side, while the load of household consumers is located on the AC side. Lack of generation is covered by an external network; Bus 1 is the AC slack bus. The AC network consists of double-circuit lines of various lengths with a 35 kV voltage level, and the DC network is a bipolar 35 kV system. Inverter 2 is a slack inverter; inverters 3, 4 and 5 provide constant power control. Figure 11 shows the results of operational control of the test system using the proposed two-level algorithm. At each local level, the storage control problem is solved using stochastic optimization in order to minimize the operating costs of every subsystem. The centralized level optimizes the settings of the inverters Pinv2-Pinv5 in order to minimize the active power flow P to the external network.
Figure 11. Operational control of the test system using the proposed two-level algorithm.
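The effect of the power limit and of the 0% to 100% storage-level bounds can be illustrated with a simple discrete-time simulation. This is not the Volterra model itself: it assumes ideal efficiency, a hypothetical storage capacity, and consistent units for power and energy, none of which are specified in the text.

```python
from typing import List, Sequence, Tuple


def level_load(imbalance: Sequence[float], capacity: float,
               soc0: float = 50.0, p_max: float = 3.5,
               dt: float = 1.0) -> Tuple[List[float], List[float]]:
    """Compensate an imbalance series with a storage under power/SoC limits.

    imbalance > 0 means surplus (charge), < 0 means deficit (discharge).
    Returns the storage power (alternating sign) and the SoC in percent.
    """
    soc = soc0
    powers: List[float] = []
    socs: List[float] = []
    for d in imbalance:
        # Power limit on charging and discharging.
        p = max(-p_max, min(p_max, d))
        # Energy limits: stay within 0% and 100% state of charge.
        max_charge = (100.0 - soc) / 100.0 * capacity / dt
        max_discharge = soc / 100.0 * capacity / dt
        p = max(-max_discharge, min(max_charge, p))
        soc += 100.0 * p * dt / capacity
        powers.append(p)
        socs.append(soc)
    return powers, socs


storage_power, storage_soc = level_load([5.0, -2.0], capacity=10.0)
```

Any part of the imbalance that the storage cannot absorb because of these limits would have to be covered by other sources or exchanged over the network.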

Conclusions, Discussion and Further Work
This paper has introduced a modeling framework, based on a two-level optimization technique, for operational and emergency control of a hybrid AC/DC community. The proposed framework has two main features. First, it provides optimal energy management policies at the local level of every grid (or microgrid) using an advanced stochastic optimization method based on deep reinforcement learning. Second, it provides the optimal redistribution of active power between subsystems by minimizing network losses. Numerical results obtained on a test case in the Baikal region show that the proposed framework is effective for grid community management and has high potential for CO2 reduction. The Volterra integral vector model for the grid community was evaluated on the real dataset and validated.
The disadvantage of the proposed algorithm is its inability to implement global control of energy storage, since this control is carried out at the local level without taking the external network into account. However, including global storage management would significantly complicate the algorithm, since the current control of inverters would then have to take into account the time interval over which consumption is minimized. In addition, global management of storages would require transferring control actions to the local level of the owners, which may involve technical difficulties. Global storage control is reserved for future work.
Further work will focus on excess power management, including its delivery both internally (storage charging) and to the external network.